RE: Using per-transaction memory contexts for storing decoded tuples

2024-09-20 Thread Hayato Kuroda (Fujitsu)
Dear Sawada-san,

> Thank you for your interest in this patch. I've just shared some
> benchmark results (with a patch) that could be different depending on
> the environment[1]. I would appreciate it if you could also run similar
> tests and share the results.

Okay, I ran similar tests; the attached script is the test runner.
rb_mem_block_size was varied from 8kB to 8MB. The table below shows the results
(in milliseconds). Each cell is the average of 5 runs.

==
block size  avg. time [ms]
8kB         12877.4
16kB        12829.1
32kB        11793.3
64kB        13134.4
128kB       13353.1
256kB       11664.0
512kB       12603.4
1MB         13443.8
2MB         12469.0
4MB         12651.4
8MB         12381.4
==

The standard deviation of the measurements was 100-500 ms, so there were no
noticeable differences in my environment either.

Also, I've checked the statistics of the generation context and confirmed that
the number of allocated blocks is roughly 1000x higher with 8kB blocks than
with 8MB blocks. [1] shows the output from MemoryContextStats(), just in case.
IIUC, the difference in actual used space comes from the header of each block:
every block carries some management attributes, so the total overhead grows
with the number of blocks.

[1]
8kB
Tuples: 724959232 total in 88496 blocks (1000 chunks); 3328 free (0 chunks); 724955904 used
Grand total: 724959232 bytes in 88496 blocks; 3328 free (0 chunks); 724955904 used

8MB
Tuples: 721420288 total in 86 blocks (1000 chunks); 1415344 free (0 chunks); 720004944 used
Grand total: 721420288 bytes in 86 blocks; 1415344 free (0 chunks); 720004944 used
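
As a rough cross-check of the header-overhead explanation (my own
back-of-the-envelope arithmetic, not taken from the thread), the gap in "used"
bytes divided by the gap in block counts gives a plausible per-block header
size:

```
-- Extra "used" bytes in the 8kB run, spread over the extra blocks, give the
-- apparent per-block management overhead.
SELECT 724955904 - 720004944                  AS used_diff_bytes,
       88496 - 86                             AS block_diff,
       (724955904 - 720004944) / (88496 - 86) AS approx_header_bytes_per_block;
-- => 4950960 | 88410 | 56
```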

Best regards,
Hayato Kuroda
FUJITSU LIMITED



test.sh
Description: test.sh


RE: [Proposal] Add foreign-server health checks infrastructure

2024-09-19 Thread Hayato Kuroda (Fujitsu)
Dear members,

(This mail is just a wrap-up)

I found that the final patch was pushed 2 days ago [1] and the BF animals are
OK so far. Therefore, I've closed the CF entry as "committed". We can extend
the feature to other platforms, but I think that can be done in another thread
later.

Thanks, everyone, for all the effort!

[1]: 
https://github.com/postgres/postgres/commit/4f08ab55457751308ffd8d33e82155758cd0e304

Best regards,
Hayato Kuroda
FUJITSU LIMITED



RE: [Proposal] Add foreign-server health checks infrastructure

2024-09-16 Thread Hayato Kuroda (Fujitsu)
Dear Fujii-san,

Thanks for reviewing!

> I made a couple of small adjustments and attached the updated version.
> If that's ok, I'll go ahead and commit it.
> 
> + Name of the local user mapped to the foreign server of this
> + connection, or "public" if a public mapping is used. If the user
> 
> I enclosed "public" with the <literal> tag, i.e., <literal>public</literal>.

Right, it should be. I grepped the sgml files just in case, and the other
occurrences are tagged with <literal> as well.

> > I did not do that because either server_name or user_name would be NULL,
> > and it might look strange. But yes, the example should have more
> > information. Based on that, I added a tuple so that the example contains
> > the rows below. Thoughts?
> >
> > loopback1 - user is "postgres", valid
> > loopback2 - user is "public", valid
> > loopback3 - user is NULL, invalid
> 
> LGTM.
> Also I added the following to the example for clarity:
> 
> postgres=# SELECT * FROM postgres_fdw_get_connections(true);

+1.

Best regards,
Hayato Kuroda
FUJITSU LIMITED



RE: Using per-transaction memory contexts for storing decoded tuples

2024-09-16 Thread Hayato Kuroda (Fujitsu)
Hi,

> We have several reports that logical decoding uses memory much more
> than logical_decoding_work_mem[1][2][3]. For instance in one of the
> reports[1], even though users set logical_decoding_work_mem to
> '256MB', a walsender process was killed by OOM because of using more
> than 4GB memory.

I appreciate your work on logical replication and am interested in the thread.
I've heard about this issue from others, and it has been a barrier to adopting
logical replication. Please let me know if I can help with benchmarking, other
measurements, etc.

Best regards,
Hayato Kuroda
FUJITSU LIMITED



RE: Conflict detection and logging in logical replication

2024-08-20 Thread Hayato Kuroda (Fujitsu)
Dear Hou,

Thanks for updating the patch! I think the patch is mostly good.
Here are minor comments.

0001:

01.
```
+<screen>
+LOG:  conflict detected on relation "schemaname.tablename": conflict=conflict_type
+DETAIL:  detailed explanation.
...
+</screen>
```

I don't think the <screen> tag is correct here. <screen> should be used for the
actual example output, not for explaining the format. I checked several files
like amcheck.sgml, auto-explain.sgml, and func.sgml, and they seem to follow
that rule.

02.
```
+ 
+  The key section in the second sentence of the
...
```

I would prefer that the section name be quoted.

0002:

03.
```
-#include "replication/logicalrelation.h"
```

Just to confirm: this removal is not related to the feature itself but is just
an improvement, right?


Best regards,
Hayato Kuroda
FUJITSU LIMITED


RE: [bug fix] prepared transaction might be lost when max_prepared_transactions is zero on the subscriber

2024-08-09 Thread Hayato Kuroda (Fujitsu)
Dear Amit, Shveta, Hou,

Thanks for giving many comments! I've updated the patch.

> @@ -4409,6 +4409,14 @@ start_apply(XLogRecPtr origin_startpos)
>   }
>   PG_CATCH();
>   {
> + /*
> + * Reset the origin data to prevent the advancement of origin progress
> + * if the transaction failed to apply.
> + */
> + replorigin_session_origin = InvalidRepOriginId;
> + replorigin_session_origin_lsn = InvalidXLogRecPtr;
> + replorigin_session_origin_timestamp = 0;
> 
> Can't we call replorigin_reset() instead here?

I didn't use the function because the arguments at the call site looked
strange, but ideally I can use it. Fixed.

> + /*
> + * Register a callback to reset the origin state before aborting the
> + * transaction in ShutdownPostgres(). This is to prevent the advancement
> + * of origin progress if the transaction failed to apply.
> + */
> + before_shmem_exit(replorigin_reset, (Datum) 0);
> 
> I think we need this despite resetting the origin-related variables in
> PG_CATCH block to handle FATAL error cases, right? If so, can we use
> PG_ENSURE_ERROR_CLEANUP() instead of PG_CATCH()?

There are two reasons to add a shmem-exit callback. One is to handle FATAL
errors; the other is to handle the case where the user requests a shutdown
while changes are being applied. In that case, I think ShutdownPostgres() can
be called, so the session origin may advance.

However, I think we cannot use the
PG_ENSURE_ERROR_CLEANUP()/PG_END_ENSURE_ERROR_CLEANUP() macros here. According
to the code, they assume that no other before-shmem-exit callbacks are
registered within the enclosed block, because the cleanup function is
registered and canceled inside the macros. LogicalRepApplyLoop() can register a
callback while handling a COMMIT PREPARED message, which breaks that rule.

Best regards,
Hayato Kuroda
FUJITSU LIMITED



v3-0001-Prevent-origin-progress-advancement-if-failed-to-.patch
Description:  v3-0001-Prevent-origin-progress-advancement-if-failed-to-.patch


RE: Found issues related with logical replication and 2PC

2024-08-08 Thread Hayato Kuroda (Fujitsu)
Dear Amit,

>
> The code changes look mostly good to me. I have changed/added a few
> comments in the attached modified version.
>

Thanks for updating the patch! LGTM. I've tested your patch and confirmed it
does not cause data loss. I used a source tree with v3 applied plus an
additional fix to visualize the replication commands [1].

Method
==

1. Construct a logical replication system with two_phase = true and
   synchronous_commit = false.
2. Attach a debugger to the subscriber's walwriter to stop the process.
3. Start a transaction and prepare it on the publisher.
4. Wait until the worker replies to the publisher.
5. Stop the subscriber.
6. Restart the subscriber.
7. Run COMMIT PREPARED.

The attached script can construct the same situation.

Result
==

After step 5, I ran pg_waldump and confirmed the PREPARE record existed on
the subscriber.

```
$ pg_waldump data_sub/pg_wal/00010001
...
rmgr: Transaction len..., desc: PREPARE gid pg_gid_16389_741: ...
rmgr: XLOGlen..., desc: CHECKPOINT_SHUTDOWN ...
```

Also, after step 7, I confirmed that only the COMMIT PREPARED record was sent,
because the log contained the line below. "75" is the ASCII code for the
character 'K', which indicates that the replication message corresponds to
COMMIT PREPARED.
```
LOG:  XXX got message 75
```



Additionally, I did another test, which is basically the same as above except
that 1) XLogFlush() in EndPrepare() was commented out and 2) kill -9 was used
at step 5 to emulate a crash. Since the prepared transaction cannot survive on
the subscriber in this case, the COMMIT PREPARED command on the publisher
causes an ERROR on the subscriber.
```
ERROR:  prepared transaction with identifier "pg_gid_16389_741" does not exist
CONTEXT:  processing remote data for replication origin "pg_16389" during
message type "COMMIT PREPARED" in transaction 741, finished at 0/15463C0
```
I think this shows that the backend process ensures the WAL is persisted, so
data loss won't occur.


[1]:
```
@@ -3297,6 +3297,8 @@ apply_dispatch(StringInfo s)
 saved_command = apply_error_callback_arg.command;
 apply_error_callback_arg.command = action;
 
+elog(LOG, "XXX got message %d", action);
```

Best regards,
Hayato Kuroda
FUJITSU LIMITED



test_0809.sh
Description: test_0809.sh


RE: Found issues related with logical replication and 2PC

2024-08-08 Thread Hayato Kuroda (Fujitsu)
Dear Amit, Shveta,

Thanks for discussing!

I reported the issue because 1) I feared the risk of data loss and 2) the
coding simply looked incorrect. However, per the discussion, I now understand
that it wouldn't lead to data loss, and that adding a global variable was
unacceptable in this case. I reworked the patch completely.

The attached patch avoids using LastCommitLSN as the local_lsn while applying
PREPARE. get_flush_position() was not changed. Also, it contains changes that
have not been discussed yet:

- Set last_commit_end to InvalidXLogRecPtr in the PREPARE case.
  This produces the same result as when the stream option is not "parallel".
- XactLastCommitEnd is replaced even in the ROLLBACK PREPARED case.
  Since the ROLLBACK PREPARED record is flushed in
  RecordTransactionAbortPrepared(), there is no need to ensure the WAL is sent.


Best regards,
Hayato Kuroda
FUJITSU LIMITED



v2-0001-Not-to-store-the-flush-position-of-the-PREPARE-re.patch
Description:  v2-0001-Not-to-store-the-flush-position-of-the-PREPARE-re.patch


RE: Found issues related with logical replication and 2PC

2024-08-07 Thread Hayato Kuroda (Fujitsu)
Dear Amit,

> Can we start a separate thread to issue 2? I understand that this one
> is also related to two_phase but since both are different issues it is
> better to discuss in separate threads. This will also help us refer to
> the discussion in future if required.

You are right, we should discuss one topic per thread. Forked: [1].

> BTW, why did the 0002 patch change the below code:
> --- a/src/include/replication/worker_internal.h
> +++ b/src/include/replication/worker_internal.h
> @@ -164,7 +164,8 @@ typedef struct ParallelApplyWorkerShared
> 
>   /*
>   * XactLastCommitEnd or XactLastPrepareEnd from the parallel apply worker.
> - * This is required by the leader worker so it can update the lsn_mappings.
> + * This is required by the leader worker so it can update the
> + * lsn_mappings.
>   */
>   XLogRecPtr last_commit_end;
>

Oops. The fixed version is posted in [1].

[1]: 
https://www.postgresql.org/message-id/TYAPR01MB5692FAC23BE40C69DA8ED4AFF5B92%40TYAPR01MB5692.jpnprd01.prod.outlook.com

Best regards,
Hayato Kuroda
FUJITSU LIMITED



[bug fix] prepared transaction might be lost when max_prepared_transactions is zero on the subscriber

2024-08-07 Thread Hayato Kuroda (Fujitsu)
Dear hackers,

This thread forks from [1] and can be used to discuss the second item. The part
below repeats the statements written in [1]; I copied them just in case. The
attached patch is almost the same but slightly modified based on a comment from
Amit [2]: an unrelated change has been removed.

Found issue
=
When the subscriber enables two-phase commit but doesn't set
max_prepared_transactions > 0, and a transaction is prepared on the publisher,
the apply worker reports an ERROR on the subscriber. After that, the prepared
transaction is not replayed, which means it's lost forever. The attached script
can emulate the situation; a minimal inline sketch follows the error below.

--
ERROR:  prepared transactions are disabled
HINT:  Set "max_prepared_transactions" to a nonzero value.
--
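
A minimal inline sketch of the scenario (my own abbreviation of the attached
script; the connection string and names are placeholders):

```
-- On the publisher:
CREATE TABLE tab (a int PRIMARY KEY);
CREATE PUBLICATION pub FOR TABLE tab;

-- On the subscriber (max_prepared_transactions is 0, the default):
CREATE TABLE tab (a int PRIMARY KEY);
CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres port=5432'
    PUBLICATION pub WITH (two_phase = true);

-- Back on the publisher:
BEGIN;
INSERT INTO tab VALUES (1);
PREPARE TRANSACTION 'gid1';
-- The apply worker on the subscriber now fails with the ERROR above, and the
-- prepared transaction is never replayed.
```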

The reason is that we advance the origin progress when aborting the transaction
as well (RecordTransactionAbort -> replorigin_session_advance). So, after
replorigin_session_origin_lsn is set, if any ERROR happens while preparing the
transaction, the transaction aborts, which incorrectly advances the origin LSN.

The easiest fix is to reset the session replication origin before calling
RecordTransactionAbort(). I think this can happen when 1) LogicalRepApplyLoop()
raises an ERROR or 2) the apply worker exits. The attached patch fixes the
issue on HEAD.

[1]: 
https://www.postgresql.org/message-id/TYAPR01MB5692FA4926754B91E9D7B5F0F5AA2%40TYAPR01MB5692.jpnprd01.prod.outlook.com
[2]: 
https://www.postgresql.org/message-id/CAA4eK1L-r8OKGdBwC6AeXSibrjr9xKsg8LjGpX_PDR5Go-A9TA%40mail.gmail.com

Best regards,
Hayato Kuroda
FUJITSU LIMITED



v2-0001-Prevent-origin-progress-advancement-if-failed-to-.patch
Description:  v2-0001-Prevent-origin-progress-advancement-if-failed-to-.patch


test_2pc.sh
Description: test_2pc.sh


RE: [Proposal] Add foreign-server health checks infrastructure

2024-08-07 Thread Hayato Kuroda (Fujitsu)
Dear Fujii-san,

Thanks for reviewing! PSA new version.

> 
>  postgres_fdw_get_connections(
>IN check_conn boolean DEFAULT false, OUT server_name text,
>OUT valid boolean, OUT used_in_xact boolean, OUT closed boolean)
>returns setof record
> 
> In the documentation, this part should be updated to include the user_name 
> output
> column.

Right, fixed.

> 
> 
> +user_name
> +text
> +
> + The local user name of this connection. If the user mapping is
> + dropped but the connection remains open (i.e., marked as
> + invalid), this will be NULL.
> 
> How about changing the first description to "Name of the local user mapped to 
> the
> foreign server of this connection, or "public" if a public mapping is used." 
> for more
> precision?

Added. I ran Grammarly and it said OK.

> - server_name | valid | used_in_xact | closed
> --+---+--+
> - loopback1   | t | t|
> - loopback2   | f | t|
> + server_name | user_name | valid | used_in_xact | closed
> +-+---+---+--+
> + loopback1   | postgres  | t | t|
> + loopback2   | postgres  | t | t|
> 
> How about displaying the record with loopback2 and valid=false like the 
> previous
> usage example?

I did not do that because either server_name or user_name would be NULL, and it
might look strange. But yes, the example should have more information. Based on
that, I added a tuple so that the example contains the rows below. Thoughts?

loopback1 - user is "postgres", valid
loopback2 - user is "public", valid
loopback3 - user is NULL, invalid

> 
> 
> +UserMapping *
> +GetUserMappingByOid(Oid umid, bool missing_ok)
> 
> postgres_fdw doesn't need a generic function to return UserMapping. How about
> simplifying the function by removing unnecessary code, e.g., as follows?
> 
> --
> tp = SearchSysCache1(USERMAPPINGOID, ObjectIdGetDatum(umid));
> if (!HeapTupleIsValid(tp))
>  nulls[i++] = true;
> else
> {
>  Oid userid =  ((Form_pg_user_mapping) GETSTRUCT(tp))->userid;
>  values[i++] = CStringGetTextDatum(MappingUserName(userid));
>  ReleaseSysCache(tp);
> }
> --

Largely agreed, but some comments and an Assert() may be needed. Done.

> -ForeignTable *
> -GetForeignTable(Oid relid);
> -
> -
> - This function returns a ForeignTable object for
> - the foreign table with the given OID.  A
> - ForeignTable object contains properties of the
> - foreign table (see foreign/foreign.h for details).
> -
> -
> -
> -
> 
> Why did you remove this code? Just a mistake?

Oh, my fault. I tried to remove GetUserMappingByOid() and this entry was also
removed at that time. Restored.

Best regards,
Hayato Kuroda
FUJITSU LIMITED



v3-0001-Extend-postgres_fdw_get_connections-to-return-use.patch
Description:  v3-0001-Extend-postgres_fdw_get_connections-to-return-use.patch


RE: Conflict detection and logging in logical replication

2024-08-07 Thread Hayato Kuroda (Fujitsu)
Dear Hou,

While playing with the 0003 patch (which may not be ready yet), I found that
when an insert_exists event occurred, both apply_error_count and
insert_exists_count were incremented.

```
-- insert a tuple on the subscriber
subscriber =# INSERT INTO tab VALUES (1);

-- insert the same tuple on the publisher, which causes insert_exists conflict
publisher =# INSERT INTO tab VALUES (1);

-- after some time...
subscriber =# SELECT * FROM pg_stat_subscription_stats;
-[ RECORD 1 ]+--
subid| 16389
subname  | sub
apply_error_count| 16
sync_error_count | 0
insert_exists_count  | 16
update_differ_count  | 0
update_exists_count  | 0
update_missing_count | 0
delete_differ_count  | 0
delete_missing_count | 0
stats_reset  |
```

Not tested, but I think this could also happen in the update_exists_count case,
and sync_error_count may be incremented when the tablesync worker detects the
conflict.

IIUC, the reason is that pgstat_report_subscription_error() is called in the
PG_CATCH part in start_apply() even after ReportApplyConflict(ERROR) is called.

What do you think of the current behavior? I don't like that the same
phenomenon is counted as several events. E.g., in the case of vacuum, the
statistics entries seem to be separated based on whether the work is done by
backends or by autovacuum. The spec feels odd in that only insert_exists and
update_exists are double-counted together with apply_error_count.

An easy fix is to introduce a global variable that is turned on when a conflict
is found.

Thoughts?

Best regards,
Hayato Kuroda
FUJITSU LIMITED



RE: Conflict detection and logging in logical replication

2024-08-06 Thread Hayato Kuroda (Fujitsu)
Dear Hou,

> 
> Here is the V11 patch set which addressed above and Kuroda-san[1] comments.
>

Thanks for updating the patch. I read 0001 again and have no critical comments
for now. I found some cosmetic issues (e.g., lines should be shorter than 80
columns) and attached a fixup patch. Feel free to include it in the next
version.

Best regards,
Hayato Kuroda
FUJITSU LIMITED



fixes_for_v11.patch
Description: fixes_for_v11.patch


RE: [Proposal] Add foreign-server health checks infrastructure

2024-08-01 Thread Hayato Kuroda (Fujitsu)
Dear Fujii-san,

> Thanks for updating the patch!
> 
> > - Changed the name of new API from `GetUserMappingFromOid` to
> `GetUserMappingByOid`
> >to keep the name consistent with others.
> 
> If we expose this function as an FDW helper function, it should return
> a complete UserMapping object, including umoptions.
> 
> However, if postgres_fdw_get_connections() is the only user of this function,
> I'm not inclined to expose it as an FDW helper.

One reason to expose it is that the function does not handle any
postgres_fdw-specific data; however, there are no users or requests from other
projects. Based on that, OK, we can move it to connection.c. If needed, we can
export it again.

> Instead, we can either get
> the user ID by user mapping OID directly in connection.c using 
> SearchSysCache(),
> or add the user ID to ConnCacheEntry and use it in
> postgres_fdw_get_connections().
> Thought?

I moved the function to connection.c; it uses SearchSysCache1().

I've tried both ways, and they worked well. One difference is that when we use
the extended ConnCacheEntry approach and the entry has been invalidated, we
cannot distinguish the reason. For example, in the case below, the entry is
invalidated, so the user_name of the output record will be NULL, whereas the
user mapping is actually still valid. We may be able to record the reason for
the invalidation, but I'm not very motivated to modify that part.

```
BEGIN;
SELECT 1 FROM ft1 LIMIT 1; -- ft1 is at server "loopback"
...
ALTER SERVER loopback OPTIONS (ADD use_remote_estimate 'off');
...
SELECT * FROM postgres_fdw_get_connections() ORDER BY 1;
-> {"loopback", NULL, } will be returned
```

Another reason is that we can keep the code consistent with the server_name
case: that part does not read data from the entry either, and we can follow
the same pattern.

Best regards,
Hayato Kuroda
FUJITSU LIMITED



v2-0001-Extend-postgres_fdw_get_connections-to-return-use.patch
Description:  v2-0001-Extend-postgres_fdw_get_connections-to-return-use.patch


RE: Conflict detection and logging in logical replication

2024-08-01 Thread Hayato Kuroda (Fujitsu)
Dear Hou,

Let me contribute to this great feature. I have read only the 0001 patch so
far; here are my initial comments.

01. logical-replication.sgml

track_commit_timestamp needs to be set only on the subscriber, but this is not
clarified. Can you write that down? (A minimal sketch follows.)
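
For example, a sketch of what the documentation could spell out (subscriber
side only; note that track_commit_timestamp requires a server restart because
it is a postmaster-level parameter):

```
-- On the subscriber only; the publisher does not need this setting.
ALTER SYSTEM SET track_commit_timestamp = on;
-- Restart the server for the change to take effect, then verify:
SHOW track_commit_timestamp;
```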

02. logical-replication.sgml

I felt that the ordering of {exists, differ, missing} should be consistent, but
it is not. For update, "differ" is listed after "missing", but for delete,
"differ" comes before "missing". The inconsistency exists in the source code
as well.

03. conflict.h

The copyright seems wrong. 2012 is not needed.

04. general

According to the documentation [1], there is another constraint "exclude", which
can cause another type of conflict. But this pattern cannot be logged in detail.
I tested below workload as an example.

=
publisher=# create table tab (a int, EXCLUDE (a WITH =));
publisher=# create publication pub for all tables;

subscriber=# create table tab (a int, EXCLUDE (a WITH =));
subscriber=# create subscription sub...;
subscriber=# insert into tab values (1);

publisher=# insert into tab values (1);

-> Got a conflict with the log lines below:
```
ERROR:  conflicting key value violates exclusion constraint "tab_a_excl"
DETAIL:  Key (a)=(1) conflicts with existing key (a)=(1).
CONTEXT:  processing remote data for replication origin "pg_16389" during
message type "INSERT" for replication target relation "public.tab" in
transaction 740, finished at 0/1543940
```
=

Can we support this type of conflict?

[1]: 
https://www.postgresql.org/docs/devel/sql-createtable.html#SQL-CREATETABLE-EXCLUDE

Best regards,
Hayato Kuroda
FUJITSU LIMITED



RE: [BUG?] check_exclusion_or_unique_constraint false negative

2024-07-31 Thread Hayato Kuroda (Fujitsu)
Dear Michail,

Thanks for pointing out the issue!

>* RelationFindReplTupleByIndex
>
>Amit, this is why I've included you in this previously solo thread :)
>RelationFindReplTupleByIndex uses DirtySnapshot and may not find some records
>if they are updated by a parallel transaction. This could lead to lost
>deletes/updates, especially in the case of streaming=parallel mode. 
>I'm not familiar with how parallel workers apply transactions, so maybe this
>isn't possible.

IIUC, the issue can happen when two concurrent transactions using DirtySnapshot
access the same tuples, which is not specific to parallel apply. Consider that
two subscriptions exist and the publishers modify the same tuple of the same
table. In this case, two workers access the tuple, so one of the changes may be
missed via the scenario you described. I feel we do not need special treatment
for parallel apply.

Best regards,
Hayato Kuroda
FUJITSU LIMITED



RE: Remove duplicate table scan in logical apply worker and code refactoring

2024-07-31 Thread Hayato Kuroda (Fujitsu)
Dear Hou,

> Thanks for reviewing the patch, and your understanding is correct.
> 
> Here is the updated patch 0001. I removed the comments as suggested by Amit.
> 
> Since 0002 patch is only refactoring the code and I need some time to review
> the comments for it, I will hold it until the 0001 is committed.

Thanks for updating the patch. I did performance testing with v2-0001.

Before: 15.553 [s]
After:  7.593 [s]

I used the attached script for setup, with almost the same settings as before;
synchronous replication was used.

[machine]
CPU(s):             120
Model name:         Intel(R) Xeon(R) CPU E7-4890 v2 @ 2.80GHz
Core(s) per socket: 15
Socket(s):          4

Best regards,
Hayato Kuroda
FUJITSU LIMITED



test.sh
Description: test.sh


RE: make pg_createsubscriber option names more consistent

2024-07-31 Thread Hayato Kuroda (Fujitsu)
Dear Peter,

> I propose to rename the pg_createsubscriber option --socket-directory to
> --socketdir.  This would make it match the equivalent option in
> pg_upgrade.  (It even has the same short option '-s'.)
> pg_createsubscriber and pg_upgrade have a lot of common terminology and
> a similar operating mode, so it would make sense to keep this consistent.

+1. If so, should we say "default current dir." instead of "default current
directory" in usage(), since pg_upgrade words it that way?

Best regards,
Hayato Kuroda
FUJITSU LIMITED
 


RE: Remove duplicate table scan in logical apply worker and code refactoring

2024-07-31 Thread Hayato Kuroda (Fujitsu)
Dear Hou,

Thanks for creating a patch!

> When reviewing the code in logical/worker.c, I noticed that when applying a
> cross-partition update action, it scans the old partition twice.
> I am attaching the patch 0001 to remove this duplicate table scan.

Just to clarify: you meant that FindReplTupleInLocalRel() is called in both
apply_handle_tuple_routing() and
apply_handle_tuple_routing()->apply_handle_delete_internal(), each of which
requires an index or sequential scan, right? LGTM.

> Apart from above, I found there are quite a few duplicate codes related to 
> partition
> handling(e.g. apply_handle_tuple_routing), so I tried to extract some
> common logic to simplify the codes. Please see 0002 for this refactoring.

IIUC, you wanted to remove the apply code from apply_handle_tuple_routing() and
keep only the partition-detection part. Is that right? Anyway, here are my
comments.

01. apply_handle_insert()

```
+targetRelInfo = edata->targetRelInfo;
+
 /* For a partitioned table, insert the tuple into a partition. */
 if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
-apply_handle_tuple_routing(edata,
-   remoteslot, NULL, CMD_INSERT);
-else
-apply_handle_insert_internal(edata, edata->targetRelInfo,
- remoteslot);
+remoteslot = apply_handle_tuple_routing(edata, CMD_INSERT, remoteslot,
+&targetRelInfo);
+
+/* For a partitioned table, insert the tuple into a partition. */
+apply_handle_insert_internal(edata, targetRelInfo, remoteslot);
```

This part contains the same comment twice, and there is no need for the
substitution in the case of normal tables. How about:

```
-/* For a partitioned table, insert the tuple into a partition. */
+/*
+ * Find the actual target table if the table is partitioned. Otherwise, use
+ * the same table as the remote one.
+ */
 if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
-apply_handle_tuple_routing(edata,
-   remoteslot, NULL, CMD_INSERT);
+remoteslot = apply_handle_tuple_routing(edata, CMD_INSERT, remoteslot,
+&targetRelInfo);
 else
-apply_handle_insert_internal(edata, edata->targetRelInfo,
- remoteslot);
+targetRelInfo = edata->targetRelInfo;
+
+/* Insert a tuple to the target table */
+apply_handle_insert_internal(edata, targetRelInfo, remoteslot);
```

02. apply_handle_tuple_routing()

```
 /*
- * This handles insert, update, delete on a partitioned table.
+ * Determine the partition in which the tuple in slot is to be inserted, and
...
```

But this function is also called from apply_handle_delete_internal(). How about
"Determine the partition to which the tuple in the slot belongs."?

03. apply_handle_tuple_routing()

Is there a reason why this returns `TupleTableSlot *` rather than
`ResultRelInfo *`? I'm not sure, but returning the relation info seems more
natural to me because this is a routing function.

04. apply_handle_update()

```
+targetRelInfo = edata->targetRelInfo;
+targetrel = rel;
+remoteslot_root = remoteslot;
```

Here I can say the same thing as 1.

05. apply_handle_update_internal()

It looks like the ordering of the function implementations has changed. Is it
intentional?

before

apply_handle_update
apply_handle_update_internal
apply_handle_delete
apply_handle_delete_internal
FindReplTupleInLocalRel
apply_handle_tuple_routing

after

apply_handle_update
apply_handle_delete
apply_handle_delete_internal
FindReplTupleInLocalRel
apply_handle_tuple_routing
apply_handle_update_internal

06. apply_handle_delete_internal()

```
+targetRelInfo = edata->targetRelInfo;
+targetrel = rel;
+
```

Here I can say the same thing as 1.

Best regards,
Hayato Kuroda
FUJITSU LIMITED





RE: speed up a logical replica setup

2024-07-30 Thread Hayato Kuroda (Fujitsu)
Dear Euler,

I applied your patch and confirmed it could fix the issue [1].

METHOD
===
1. shortened the snapshot interval to 3ms, and
```
-#define LOG_SNAPSHOT_INTERVAL_MS 15000
+#define LOG_SNAPSHOT_INTERVAL_MS 3
```
2. ran test 040_pg_createsubscriber.pl many times.
   Before the patch, I could reproduce the failure in under an hour of runs.
   After the patch, I ran for more than an hour but could not reproduce it.

Since this is a timing issue that I could not reliably reproduce, I'd like
others to test it as well.

[1]: 
https://www.postgresql.org/message-id/68de6498-0449-a113-dd03-e198dded0bac%40gmail.com

Best regards,
Hayato Kuroda
FUJITSU LIMITED





RE: speed up a logical replica setup

2024-07-29 Thread Hayato Kuroda (Fujitsu)
Dear Euler,

> Hayato already mentioned one of the solution in a previous email [2].
> AFAICS we can use any solution that creates a WAL record. I tested the
> following options:

Yes, my point was that we should add an arbitrary record after the 
recovery_target_lsn.

> (a) temporary replication slot: requires an additional replication slot.
> small payload. it is extremely slow in comparison with the other
> options.
> (b) logical message: can be consumed by logical replication when/if it
> is supported some day. big payload. fast.
> (c) snapshot of running txn:  small payload. fast.
> (d) named restore point: biggest payload. fast.

Another point of view is whether each option triggers a flush of the generated
WAL record. You've tested pg_logical_emit_message() with flush=false, which
means the WAL is not flushed, so wait_for_end_recovery() has to wait; IIUC it
may wait up to 15 seconds, which is the interval at which the bgwriter works.
If flush is set to true, the execution time will be longer.
pg_create_restore_point() also does not flush the generated record, so it may
be problematic. Moreover, the restore point name might conflict with one that
users want to use.

Overall, I can agree with using pg_log_standby_snapshot(). It flushes the WAL
asynchronously but ensures the timing is not delayed much.
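
For reference, here is how I understand the tested options map to SQL calls
(slot/restore-point names and payloads are placeholders; the flush argument of
pg_logical_emit_message() is the one discussed above):

```
-- (a) temporary replication slot (slow per the measurements)
SELECT pg_create_logical_replication_slot('tmp_marker', 'pgoutput', true);

-- (b) logical message; flush => false was the tested setting
SELECT pg_logical_emit_message(false, 'pg_createsubscriber', 'marker', false);

-- (c) snapshot of running txns; flushed asynchronously
SELECT pg_log_standby_snapshot();

-- (d) named restore point; the record is not flushed here either
SELECT pg_create_restore_point('pg_createsubscriber_marker');
```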

Best regards,
Hayato Kuroda
FUJITSU LIMITED



RE: [Proposal] Add foreign-server health checks infrastructure

2024-07-28 Thread Hayato Kuroda (Fujitsu)
Dear Fujii-san,

> > IIUC, the patch which adds the user_name attribute to get_connection() can
> > be discussed at a later stage, is that right?
> 
> No, let's work on the patch at this stage :)

OK, here is a rebased patch.

- Changed the name of new API from `GetUserMappingFromOid` to 
`GetUserMappingByOid`
  to keep the name consistent with others.
- Comments and docs were updated.

Best regards,
Hayato Kuroda
FUJITSU LIMITED



0001-Extend-postgres_fdw_get_connections-to-return-user-n.patch
Description:  0001-Extend-postgres_fdw_get_connections-to-return-user-n.patch


RE: [Proposal] Add foreign-server health checks infrastructure

2024-07-28 Thread Hayato Kuroda (Fujitsu)
Dear Fujii-san,

Thanks for pushing and analyzing the failure!

> The regression.diffs shows that pgfdw_conn_check returned 0 even though
> pgfdw_conn_checkable()
> returned true. This can happen if the "revents" from poll() indicates 
> something
> other than
> POLLRDHUP. I think that "revents" could indicate POLLHUP, POLLERR, or
> POLLNVAL. Therefore,
> IMO pgfdw_conn_check() should be updated as follows. I will test this change.
> 
> -   return (input_fd.revents & POLLRDHUP) ? 1 : 0;
> +   return (input_fd.revents &
> +   (POLLRDHUP | POLLHUP | POLLERR |
> POLLNVAL)) ? 1 : 0;

I think you are right.
According to the man page, the input socket is invalid or disconnected if 
revents
has such bits. So such cases should be also regarded as *failure*.
After the fix, the animal said OK. Great works!

Best regards,
Hayato Kuroda
FUJITSU LIMITED



RE: [Proposal] Add foreign-server health checks infrastructure

2024-07-26 Thread Hayato Kuroda (Fujitsu)
Dear Fujii-san,

Just in case: based on the agreement in [1], I updated the patches to keep them
consistent, so we can work from the same picture in further discussions.

IIUC, the patch which adds the user_name attribute to get_connection() can be
discussed at a later stage, is that right?

[1]: 
https://www.postgresql.org/message-id/768032ee-fb57-494b-b674-1ccb65b6f969%40oss.nttdata.com

Best regards,
Hayato Kuroda
FUJITSU LIMITED



v44-0001-doc-Enhance-documentation-for-postgres_fdw_get_c.patch
Description:  v44-0001-doc-Enhance-documentation-for-postgres_fdw_get_c.patch


v44-0002-postgres_fdw-Add-used_in_xact-column-to-postgres.patch
Description:  v44-0002-postgres_fdw-Add-used_in_xact-column-to-postgres.patch


v44-0003-postgres_fdw-Add-connection-status-check-to-post.patch
Description:  v44-0003-postgres_fdw-Add-connection-status-check-to-post.patch


RE: [Proposal] Add foreign-server health checks infrastructure

2024-07-25 Thread Hayato Kuroda (Fujitsu)
I apologize for posting an incomplete message; here is the correct version.

Dear Fujii-san,

> Thanks for updating the patches!
> 
> I’ve created a new base patch and split the v42-0001 patch into two parts
> to implement the feature and improvements step by step. After pushing these
> patches, I’ll focus on the v42-0002 patch next.

+1.

> I’ve attached the three patches.
> 
> v43-0001:
> This new patch enhances documentation for postgres_fdw_get_connections()
> output columns. The output columns were documented in text format,
> which is manageable for the current two columns. However, upcoming patches
> will add new columns, making text descriptions less readable.
> This patch updates the documentation to use a table format,
> making it easier for users to understand each output column.

Agreed on adding the table. I ran a proofreading tool, and it pointed out the
items below. Please apply them if they are acceptable.

```
+  This function returns information about all open connections that
```
-> "that" can be removed.

```
+ the current transaction but its foreign server or
```
-> can add comma before "but".

> I’ve also made several code improvements, for example adding a typedef for
> pgfdwVersion to simplify its usage, and updated typedefs.list.
> 
> +enum pgfdwVersion
> +{
> + PGFDW_V1_0 = 0,
> + PGFDW_V1_2,
> +}pgfdwVersion;

It was intentionally not added, because while developing pg_createsubscriber,
I got a comment that local-use data does not have to be typedef'ed [1]:
```
- src/bin/pg_basebackup/pg_createsubscriber.c
+typedef struct CreateSubscriberOptions
+typedef struct LogicalRepInfo
I think these kinds of local-use struct don't need to be typedef'ed.
(Then you also don't need to update typdefs.list.)
```

A comment for 0002. 

```
+Datum
+postgres_fdw_get_connections_1_2(PG_FUNCTION_ARGS)
+{
+postgres_fdw_get_connections_internal(fcinfo, PGFDW_V1_2);
+
+return (Datum) 0;
+}
```

I know they have the same meaning, but can you clarify why (Datum) 0 is
returned instead of PG_RETURN_VOID()?

[1]: 
https://www.postgresql.org/message-id/3ee79f2c-e8b3-4342-857c-a31b87e1afda%40eisentraut.org

Best regards,
Hayato Kuroda
FUJITSU LIMITED



RE: [Proposal] Add foreign-server health checks infrastructure

2024-07-25 Thread Hayato Kuroda (Fujitsu)
Dear Fujii-san,

> 
> Thanks for updating the patches!
> 
> I’ve created a new base patch and split the v42-0001 patch into two parts
> to implement the feature and improvements step by step. After pushing these
> patches, I’ll focus on the v42-0002 patch next.

Best regards,
Hayato Kuroda
FUJITSU LIMITED
> 
> I’ve attached the three patches.
> 
> v43-0001:
> This new patch enhances documentation for postgres_fdw_get_connections()
> output columns. The output columns were documented in text format,
> which is manageable for the current two columns. However, upcoming patches
> will add new columns, making text descriptions less readable.
> This patch updates the documentation to use a table format,
> making it easier for users to understand each output column.
> 
> v43-0002:
> This patch adds the "used_in_xact" column to postgres_fdw_get_connections().
> It separates this change from the original v42-0001 patch for clarity.
> 
> v43-0003
> This patch adds the "closed" column to postgres_fdw_get_connections().
> 
> I’ve also made several code improvements, for example adding a typedef for
> pgfdwVersion to simplify its usage, and updated typedefs.list.
> 
> +enum pgfdwVersion
> +{
> + PGFDW_V1_0 = 0,
> + PGFDW_V1_2,
> +}pgfdwVersion;
> 
> Regards,
> 
> --
> Fujii Masao
> Advanced Computing Technology Center
> Research and Development Headquarters
> NTT DATA CORPORATION


RE: Parallel heap vacuum

2024-07-25 Thread Hayato Kuroda (Fujitsu)
Dear Sawada-san,

> Thank you for the test!
> 
> I could reproduce this issue and it's a bug; it skipped even
> non-all-visible pages. I've attached the new version patch.
> 
> BTW since we compute the number of parallel workers for the heap scan
> based on the table size, it's possible that we launch multiple workers
> even if most blocks are all-visible. It seems to be better if we
> calculate it based on (relpages - relallvisible).

Thanks for updating the patch. I applied it and confirmed that all pages are
scanned. I used almost the same script (just changing
max_parallel_maintenance_workers; see the sketch after the numbers) and got the
results below. I think the tendency is the same as yours.

```
parallel 0: 61114.369 ms
parallel 1: 34870.316 ms
parallel 2: 23598.115 ms
parallel 3: 17732.363 ms
parallel 4: 15203.271 ms
parallel 5: 13836.025 ms
```
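
A sketch of how I varied the knob per run (the table name is a placeholder;
this assumes the patched VACUUM derives its heap-scan worker count from this
setting):

```
-- Repeated for settings 0 through 5, one VACUUM per run.
SET max_parallel_maintenance_workers = 4;
VACUUM (VERBOSE) large_tbl;
```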

I started to read your code, but it is taking some time because the area is new
to me... The part below contains my initial comments.

1.
This patch cannot be built when debug mode is enabled. See [1].
IIUC, this is because NewRelminMxid was moved from struct LVRelState to
PHVShared. So the code should be updated to something like
"vacrel->counters->NewRelminMxid".

2.
I compared this with parallel heap scan and found that it does not have a
compute_worker API. Can you clarify why there is an inconsistency?
(I suspect it is intentional because the calculation logic seems to depend on
the heap structure; should we add the API for table scans as well?)

[1]:
```
vacuumlazy.c: In function ‘lazy_scan_prune’:
vacuumlazy.c:1666:34: error: ‘LVRelState’ {aka ‘struct LVRelState’} has no member named ‘NewRelminMxid’
  Assert(MultiXactIdIsValid(vacrel->NewRelminMxid));
          ^~

```

Best regards,
Hayato Kuroda
FUJITSU LIMITED



RE: [Proposal] Add foreign-server health checks infrastructure

2024-07-24 Thread Hayato Kuroda (Fujitsu)
Dear Fujii-san,

> On 2024/07/18 19:49, Hayato Kuroda (Fujitsu) wrote:
> >> Shouldn't this test also check if the returned user_name is valid?
> >
> > You meant to say that we should print the user_name, right? Done.
> 
> Yes, I think it's better to test if the value in the user_name column is as 
> expected.

But this might cause a test failure on cfbot. See the comment below.

> > - I found an inconsistency of name between source and doc,
> >so I unified to postgres_fdw_can_verify_connection().
> 
> I'm unsure about the necessity of introducing a standalone function to check
> POLLRDHUP availability. Instead of providing
> postgres_fdw_can_verify_connection(),
> could we modify postgres_fdw_get_connections() to return NULL in the "closed"
> column on platforms where POLLRDHUP isn't supported?

Initially I felt that users might not be able to distinguish whether 1) the
connection has not been established yet or 2) the check cannot be done on this
platform. But on further consideration, such servers are not listed by the
function anyway, so I modified it like that.

> > - Also, test patch (0002) was combined into this.
> > 0002:
> > - user_name was added after the `valid` to preserve ordering of the old 
> > version.
> 
> Do we really need to keep this ordering? Wouldn't it be more intuitive to
> have the user_name column next to server_name? In pg_stat_statements,
> for example, the ordering isn't always preserved, as seen with
> WAL-related columns being added in the middle.

I also prefer that style, so I changed it accordingly.

> > Thanks for reviewing the patch! PSA new version.
> 
> Thanks for updating the patches!
> 
> 
> The regression test for postgres_fdw failed with the following diff.
> 
> --
>   SELECT * FROM postgres_fdw_get_connections() ORDER BY 1;
>server_name | valid | user_name | used_in_xact | closed
>   -+---+---+--+
> - loopback| f | hayato| t|
> + loopback| f | runner| t|
>| f |   | t|
>   (2 rows)
> --

This is because the user_name depends heavily on the environment. It does not
look great, but one approach is to print `user_name = CURRENT_USER` to test the
feature (see the sketch below). For loopback3, the user_name is set to NULL, so
the column will be NULL as well. What do you think? Do you have a better idea?
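
A sketch of what I have in mind for the regression test (written against the
v42 columns; the output would then be environment-independent):

```
-- Compare against CURRENT_USER instead of printing the raw name.
SELECT server_name,
       user_name = CURRENT_USER AS user_name_is_current_user,
       valid, used_in_xact, closed
  FROM postgres_fdw_get_connections() ORDER BY 1;
```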

> +  * If requested and the connection is not invalidated,
> check the
> +  * status of the remote connection from the backend
> process and
> +  * return the result. Otherwise returns NULL.
> +  */
> + if (require_verify && !entry->invalidated &&
> entry->conn)
> 
> Should we also consider checking invalidated connections? Even though
> a connection is marked as invalidated, it can still be used until
> the current transaction ends. Therefore, if an invalidated connection
> is used in this transaction (i.e., used_in_xact = true) and has
> already been closed (closed = true), users might want to consider
> rolling back the transaction promptly. Thought?

I confirmed the meaning of the `invalidated` attribute. IIUC:

- It is set to true when the server or user mapping is altered, but
- the connection has already been opened within the transaction.

If the entry is invalidated, the server_name returned by
postgres_fdw_get_connections is set to NULL (because the entry may have been
modified). Also, the connection is discarded when the transaction ends.
Based on that understanding, yes, the connection should also be checked. One
concern is that the user may not recognize which connection is lost (because
the column may be blank).

> + -- Set client_min_messages to ERROR temporary because the
> following
> + -- function only throws a WARNING on the supported platform.
> 
> Is this still true? From my reading of the code, it doesn't appear
> that the function throws a WARNING.

Good finding, removed.

The attached patches contain the above fixes plus comment improvements
suggested by GPT-4o.

Best regards,
Hayato Kuroda
FUJITSU LIMITED



v42-0001-postgres_fdw-Allow-postgres_fdw_get_connections-.patch
Description:  v42-0001-postgres_fdw-Allow-postgres_fdw_get_connections-.patch


v42-0002-Extend-postgres_fdw_get_connections-to-return-us.patch
Description:  v42-0002-Extend-postgres_fdw_get_connections-to-return-us.patch


Found issues related with logical replication and 2PC

2024-07-23 Thread Hayato Kuroda (Fujitsu)
Hi hackers,

While creating a patch which allows ALTER SUBSCRIPTION SET (two_phase) [1],
we found some issues related to logical replication and two_phase. I think
these can happen not only on HEAD but also on PG14+, but for now I've shared
patches for HEAD.

Issue #1

When handling a PREPARE message, the subscriber mistakes the wrong LSN (the end
position of the last commit) for the end position of the current prepare. This
can be fixed by adding a new global variable that records the end position of
the last prepare. The 0001 patch fixes the issue.

Issue #2

When the subscriber enables two-phase commit but doesn't set
max_prepared_transactions > 0, and a transaction is prepared on the publisher,
the apply worker reports an ERROR on the subscriber. After that, the prepared
transaction is not replayed, which means it's lost forever. The attached script
can emulate the situation.

--
ERROR:  prepared transactions are disabled
HINT:  Set "max_prepared_transactions" to a nonzero value.
--

The reason is that we advance the origin progress when aborting the transaction
as well (RecordTransactionAbort -> replorigin_session_advance). So, after
replorigin_session_origin_lsn is set, if any ERROR happens while preparing the
transaction, the transaction aborts, which incorrectly advances the origin LSN.

The easiest fix is to reset the session replication origin before calling
RecordTransactionAbort(). I think this can happen when 1) LogicalRepApplyLoop()
raises an ERROR or 2) the apply worker exits. The 0002 patch fixes the issue.


What do you think?

[1]: 
https://www.postgresql.org/message-id/flat/8fab8-65d74c80-1-2f28e880@39088166

Best regards,
Hayato Kuroda
FUJITSU LIMITED



test_2pc.sh
Description: test_2pc.sh


0001-Add-XactLastPrepareEnd-to-indicate-the-last-PREPARE-.patch
Description:  0001-Add-XactLastPrepareEnd-to-indicate-the-last-PREPARE-.patch


0002-Prevent-origin-progress-advancement-if-failed-to-app.patch
Description:  0002-Prevent-origin-progress-advancement-if-failed-to-app.patch


RE: pg_upgrade and logical replication

2024-07-22 Thread Hayato Kuroda (Fujitsu)
Dear Amit, Michael,

> > I am not sure to get the reason why get_old_cluster_logical_slot_infos()
> > could not be optimized, TBH.  LogicalReplicationSlotHasPendingWal()
> > uses the fast forward mode where no changes are generated, hence there
> > should be no need for a dependency to a connection to a specific
> > database :)
> >
> > Combined to a hash table based on the database name and/or OID to know
> > to which dbinfo to attach the information of a slot, then it should be
> > possible to use one query, making the slot info gathering closer to
> > O(N) rather than the current O(N^2).
> >
> 
> The point is that unlike subscriptions logical slots are not
> cluster-level objects. So, this needs more careful design decisions
> rather than a fix-up patch for PG-17. One more thing after collecting
> slot-level, we also want to consider the creation of slots which again
> are created at per-database level.

I also considered the combination with the optimization (parallelization) of
pg_upgrade [1]. IIUC, that patch connects to several databases in parallel and
runs commands. The current style of create_logical_replication_slots() can
easily be adapted to it because the tasks are divided per database.

However, if we change get_old_cluster_logical_slot_infos() to work in a single
pass, we would have to turn LogicalSlotInfoArr into cluster-wide data and store
the database name in LogicalSlotInfo. Also, in
create_logical_replication_slots(), we would have to look up the database for
every slot and connect to the appropriate one. These changes would make it
difficult to parallelize the operation.

[1]: 
https://www.postgresql.org/message-id/flat/20240516211638.GA1688936@nathanxps13

Best regards,
Hayato Kuroda
FUJITSU LIMITED


RE: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-07-21 Thread Hayato Kuroda (Fujitsu)
Dear Amit,

> + /*
> + * Do not allow changing the option if the subscription is enabled. This
> + * is because both failover and two_phase options of the slot on the
> + * publisher cannot be modified if the slot is currently acquired by the
> + * existing walsender.
> + */
> + if (sub->enabled)
> + ereport(ERROR,
> + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> + errmsg("cannot set %s for enabled subscription",
> + option)));
> 
> As per my understanding, the above comment is not true when we are
> changing 'two_phase' option from 'false' to 'true' because in that
> case, the existing walsender will only change it. So, ideally, we can
> allow toggling two_phase from 'false' to 'true' without the above
> restriction.

Hmm, yes. In the "false" -> "true" case, the slot parameter is not changed by
the backend process. In this case, subtwophasestate is changed to PENDING
first, and then the walsender changes it to ENABLED based on the worker's
request.
> If this is correct then we don't even need to error for the case
> "cannot alter two_phase when logical replication worker is still
> running" when 'two_phase' option is changed from 'false' to 'true'.

Basically right; one note is that there is an Assert in
maybe_reread_subscription() that should also be modified.

> Now, assuming the above observations are correct, we may still want to
> have the same behavior when toggling two_phase option but we can at
> least note down that in the comments so that if required the same can
> be changed when toggling 'two_phase' option from 'false' to 'true' in
> future.
> 
> Thoughts?

+1 for adding comments in CheckAlterSubOption(). How about the draft below?

```
@@ -1089,6 +1089,12 @@ CheckAlterSubOption(Subscription *sub, const char 
*option,
  * is because both failover and two_phase options of the slot on the
  * publisher cannot be modified if the slot is currently acquired by the
  * existing walsender.
+ *
+ * XXX: when toggling two_phase from "false" to "true", the slot parameter
+ * is not modified by the backend process, so no lock conflict can
+ * occur. The restarted walsender will do the alteration. Therefore, we
+ * could allow the switch without this restriction. This can be changed in
+ * the future based on the requirements.
```


Best regards,
Hayato Kuroda
FUJITSU LIMITED



RE: [Proposal] Add foreign-server health checks infrastructure

2024-07-18 Thread Hayato Kuroda (Fujitsu)
Dear Fujii-san,

Hi, long time no see :-).
Thanks for reviewing the patch! PSA new version.

> I've just started reviewing them.
> 
> Here are the review comments for 0001 patch:
> 
> Do we really need postgres_fdw_verify_connection_all()? The proposed feature
> aims to check if all postgres_fdw connections used within the transaction
> are still open. If any of those connections are closed, the transaction
> can't be committed successfully, so users can roll back immediately upon
> detecting a closed connection.
> 
> However, postgres_fdw_verify_connection_all() checks all connections made in
> the session, not just those used in the current transaction. This means
> users can't determine if they should roll back the transaction based on
> its return value. Therefore, I'm concerned that
> postgres_fdw_verify_connection_all() might not be very useful. Thoughts?

Right. My primary motivation is to detect disconnection from remote servers and
abort the current transaction as soon as possible. For that purpose, as I
posted in [1], ...all() is not so helpful. I agree to remove the function from
the patch set. Done.

> Considering the purpose of this feature, wouldn't it be better to extend
> postgres_fdw_get_connections() to include a "used_in_xact" column
> (indicating whether the connection has been used in the current transaction)
> and a "closed" column (indicating whether the connection has been closed)?
> This approach might be more effective than introducing a new function
> like the postgres_fdw_verify_connection family?
> 
> If it's too much to check if the connection is closed by default whenever
> calling postgres_fdw_get_connections(), we could modify it to accept
> an argument indicating whether to perform this check. Thoughts?

Sounds interesting. If we can accept changing the definition of a pre-existing
function, that seems better. To keep the default behavior, an input parameter
should be added. The attached patch implements this.
> Here are the review comments for 0003 patch:
> 
> The source comment in postgres_fdw_get_connections() should mention
> the return value user_name.

Document was updated.

> We also need to handle the case where the user mapping used by
> the connection cache has been dropped. Otherwise, this could
> lead to an error.
> 
> -
> =# BEGIN;
> =*# SELECT * FROM postgres_fdw_get_connections();
>   server_name | user_name | valid
> -+---+---
>   loopback| public| t
> (1 row)
> 
> =*# DROP USER MAPPING FOR public SERVER loopback ;
> =*# SELECT * FROM postgres_fdw_get_connections();
> ERROR:  cache lookup failed for user mapping 16409
> -

Fixed.

> -SELECT * FROM postgres_fdw_get_connections() ORDER BY 1;
> +SELECT server_name, valid FROM postgres_fdw_get_connections() ORDER BY
> 1;
> 
> Shouldn't this test also check if the returned user_name is valid?

You meant to say that we should print the user_name, right? Done.

> + server_name | user_name | validvalid
> +-+
> + loopback1   | postgres  | t
> + loopback2   |   | f
> 
> The column name "validvalid" should be "valid".

Right, fixed.

> How can we cause the record with server_name != NULL but user_name = NULL?

Now this can happen when the user mapping is dropped; the example was updated
accordingly.

Below is a summary of the changes.
0001:
- Instead of adding new functions, postgres_fdw_get_connections() was extended.
  Some tricks were added to support old versions; I followed
  pg_stat_statements.c. Attributes were added after `valid` to preserve the
  column ordering of the old version.
- I found an inconsistency in naming between the source and the docs,
  so I unified it to postgres_fdw_can_verify_connection().
- Also, the test patch (0002) was merged into this one.
0002:
- user_name was added after `valid` to preserve the column ordering of the old
  version.
- GetUserMappingFromOid() is allowed to miss a tuple.


[1]: 
https://www.postgresql.org/message-id/TYAPR01MB58668728393648C2F7DC7C85F5399%40TYAPR01MB5866.jpnprd01.prod.outlook.com

Best regards,
Hayato Kuroda
FUJITSU LIMITED



v41-0001-postgres_fdw-Allow-postgres_fdw_get_connections-.patch
Description:  v41-0001-postgres_fdw-Allow-postgres_fdw_get_connections-.patch


v41-0002-Extend-postgres_fdw_get_connections-to-return-us.patch
Description:  v41-0002-Extend-postgres_fdw_get_connections-to-return-us.patch


RE: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-07-17 Thread Hayato Kuroda (Fujitsu)
Dear Peter,

Thanks for the comments! PSA new version.
I think most of the comments were addressed, and I ran pgindent/pgperltidy
again.

Regarding CheckAlterSubOption(), the ordering is still preserved because I
preferred to keep the existing behavior, but I agree that your approach makes
the code simpler.

BTW, I've started to think the patches could be merged in a future version,
because they must be committed together and the code is not so long. Thoughts?

Best regards,
Hayato Kuroda
FUJITSU LIMITED



v20-0001-Allow-altering-of-two_phase-option-of-a-SUBSCRIP.patch
Description:  v20-0001-Allow-altering-of-two_phase-option-of-a-SUBSCRIP.patch


v20-0002-Alter-slot-option-two_phase-only-when-altering-t.patch
Description:  v20-0002-Alter-slot-option-two_phase-only-when-altering-t.patch


v20-0003-Notify-users-to-roll-back-prepared-transactions.patch
Description:  v20-0003-Notify-users-to-roll-back-prepared-transactions.patch


RE: speed up a logical replica setup

2024-07-17 Thread Hayato Kuroda (Fujitsu)
Dear Amit,

> Your analysis sounds correct to me.

Okay, so we now have the same picture...

> > IIUC, the root cause is that pg_create_logical_replication_slot() returns a 
> > LSN
> > which is not generated yet. So, I think both mine [1] and Euler's approach 
> > [2]
> > can solve the issue. My proposal was to add an extra WAL record after the 
> > final
> > slot creation, and Euler's one was to use a restart_lsn as the
> recovery_target_lsn.
> >
> 
> I don't think it is correct to set restart_lsn as consistent_lsn point
> because the same is used to set replication origin progress. Later
> when we start the subscriber, the system will use that LSN as a
> start_decoding_at point which is the point after which all the commits
> will be replicated. So, we will end up incorrectly using restart_lsn
> (LSN from where we start reading the WAL) as start_decoding_at point.
> How could that be correct?

I didn't say we could use restart_lsn as the consistent point of logical
replication, but I agree the approach has issues.

> Now, even if we use restart_lsn as recovery_target_lsn and the LSN
> returned by pg_create_logical_replication_slot() as consistent LSN to
> set replication progress, that also could lead to data loss because
> the subscriber may never get data between restart_lsn value and
> consistent LSN value.

You are considering the case where, e.g., tuples are inserted just after
restart_lsn but before the RUNNING_XACT record? In that case, yes, streaming
replication finishes before those tuples are replicated, and logical
replication will skip them. Euler's approach cannot be used as-is.

Best regards,
Hayato Kuroda
FUJITSU LIMITED



RE: speed up a logical replica setup

2024-07-17 Thread Hayato Kuroda (Fujitsu)
Dear Alexander, Euler, Amit,

I also analyzed this failure; let me share it. I think the events occurred in
the following order.

1. Backend created a publication on $db2,
2. BGWriter generated RUNNING_XACT record, then
3. Backend created a replication slot on $db2.

In this case, the recovery_target_lsn is ahead of the RUNNING_XACT record
generated at step 3. Also, since both the bgwriter and slot creation mark the
record as an *UNIMPORTANT* one, the bgwriter won't log a new standby snapshot
even after LOG_SNAPSHOT_INTERVAL_MS. The rule is written in BackgroundWriterMain():

```
			/*
			 * Only log if enough time has passed and interesting records have
			 * been inserted since the last snapshot.  Have to compare with <=
			 * instead of < because GetLastImportantRecPtr() points at the
			 * start of a record, whereas last_snapshot_lsn points just past
			 * the end of the record.
			 */
			if (now >= timeout &&
				last_snapshot_lsn <= GetLastImportantRecPtr())
			{
				last_snapshot_lsn = LogStandbySnapshot();
				last_snapshot_ts = now;
			}
```

Therefore, pg_createsubscriber waited until a new record was replicated, but no
activity was recorded, causing a timeout. Since this is a timing issue, Alexander
could reproduce the failure with a shorter duration and parallel runs.

IIUC, the root cause is that pg_create_logical_replication_slot() returns an LSN
that has not been generated yet. So, I think both my approach [1] and Euler's [2]
can solve the issue. My proposal was to add an extra WAL record after the final
slot creation, and Euler's was to use the restart_lsn as the recovery_target_lsn.
On the primary server, the restart_lsn is set to the latest WAL insert position,
and the RUNNING_XACT record is generated after that.
What do you think?

[1]: 
https://www.postgresql.org/message-id/osbpr01mb25521b15bf950d2523bbe143f5...@osbpr01mb2552.jpnprd01.prod.outlook.com
[2]: 
https://www.postgresql.org/message-id/b1f0f8c7-8f01-4950-af77-339df3dc4684%40app.fastmail.com

Best regards,
Hayato Kuroda
FUJITSU LIMITED



RE: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-07-16 Thread Hayato Kuroda (Fujitsu)
Dear Hou, Peter,

Thanks for the comments! PSA new version.
Almost all comments were addressed.
What's new:
0001 - IsTwoPhaseTransactionGidForSubid() was updated per the comment from
  Hou-san [1]. Some nitpicks were accepted.
0002 - An argument in CheckAlterSubOption() was renamed to "slot_needs_update".
  Some nitpicks were accepted.
0003 - Some nitpicks were accepted.

The part below explains why I rejected some comments.

> CommonChecksForFailoverAndTwophase:
> nitpick - added Assert for the generic-looking "option" parameter name

That style looks strange to me; using multiple strcmp() calls is more
straightforward. I added it like that.
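
A minimal sketch of what was added, assuming the parameter is named "option"
as in the nitpick:

```c
/* Sketch only: guard against unexpected option names */
Assert(strcmp(option, "failover") == 0 ||
	   strcmp(option, "two_phase") == 0);
```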

> 1c.
> If the error checks can be moved to be done up-front, then all the 
> 'needs_update'
> can be combined. Avoiding multiple checks to 'needs_update' will make this 
> function simpler.

This style was needed to preserve the error conditions for the failover option.
Not changed.

[1]: 
https://www.postgresql.org/message-id/OS3PR01MB571834FBD3E6D3804484038F94A32%40OS3PR01MB5718.jpnprd01.prod.outlook.com

Best regards,
Hayato Kuroda
FUJITSU LIMITED



v19-0001-Allow-altering-of-two_phase-option-of-a-SUBSCRIP.patch
Description:  v19-0001-Allow-altering-of-two_phase-option-of-a-SUBSCRIP.patch


v19-0002-Alter-slot-option-two_phase-only-when-altering-t.patch
Description:  v19-0002-Alter-slot-option-two_phase-only-when-altering-t.patch


v19-0003-Notify-users-to-roll-back-prepared-transactions.patch
Description:  v19-0003-Notify-users-to-roll-back-prepared-transactions.patch


RE: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-07-15 Thread Hayato Kuroda (Fujitsu)
Dear Amit, Hou,

Thanks for the comments! PSA new versions.
What's new:

0001: included Hou's patch [1] so as not to overwrite slot options.
  Some other comments were also addressed.
0002: not much changed, just rebased.
0003: a typo was fixed, s/Alter/After/.

[1]: 
https://www.postgresql.org/message-id/OS3PR01MB57184E0995521300AC06CB4B94A72%40OS3PR01MB5718.jpnprd01.prod.outlook.com
 

Best regards,
Hayato Kuroda
FUJITSU LIMITED



v18-0001-Allow-altering-of-two_phase-option-of-a-SUBSCRIP.patch
Description:  v18-0001-Allow-altering-of-two_phase-option-of-a-SUBSCRIP.patch


v18-0002-Alter-slot-option-two_phase-only-when-altering-t.patch
Description:  v18-0002-Alter-slot-option-two_phase-only-when-altering-t.patch


v18-0003-Notify-users-to-roll-back-prepared-transactions.patch
Description:  v18-0003-Notify-users-to-roll-back-prepared-transactions.patch


RE: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-07-09 Thread Hayato Kuroda (Fujitsu)
> 0001 - Codes for SUBOPT_TWOPHASE_COMMIT are moved per requirement [1].
>Also, checks for failover and two_phase are unified into one function.
> 0002 - updated accordingly. An argument for the check function is added.
> 0003 - this contains documentation changes required in [2].

The previous patch set could not be accepted due to a missed initialization.
PSA new version.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



v17-0001-Allow-altering-of-two_phase-option-of-a-SUBSCRIP.patch
Description:  v17-0001-Allow-altering-of-two_phase-option-of-a-SUBSCRIP.patch


v17-0002-Alter-slot-option-two_phase-only-when-altering-t.patch
Description:  v17-0002-Alter-slot-option-two_phase-only-when-altering-t.patch


v17-0003-Notify-users-to-roll-back-prepared-transactions.patch
Description:  v17-0003-Notify-users-to-roll-back-prepared-transactions.patch


RE: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-07-09 Thread Hayato Kuroda (Fujitsu)
Dear Amit,

> > I see that in 0003/0004, the patch first aborts pending prepared
> > transactions, update's catalog, and then change slot's property via
> > walrcv_alter_slot. What if there is any ERROR (say the remote node is
> > not reachable or there is an error while updating the catalog) after
> > we abort the pending prepared transaction? Won't we end up with lost
> > prepared transactions in such a case?

Yes, the case can happen with v16, because FinishPreparedTransaction() itself is
not a transactional operation. In the example below, the subscription was altered
after stopping the publisher. You can see that the prepared transactions were
rolled back.

```
subscriber=# SELECT gid FROM pg_prepared_xacts ;
       gid        
------------------
 pg_gid_16390_741
 pg_gid_16390_742
(2 rows)
subscriber=# ALTER SUBSCRIPTION sub SET (TWO_PHASE = off, FORCE_ALTER = on);
NOTICE:  requested altering to two_phase = false but there are prepared 
transactions done by the subscription
DETAIL:  Such transactions are being rollbacked.
ERROR:  could not connect to the publisher: connection to server on socket 
"/tmp/.s.PGSQL.5431" failed: No such file or directory
Is the server running locally and accepting connections on that socket?
subscriber=# SELECT gid FROM pg_prepared_xacts ;
 gid 
-----
(0 rows)
```

> Considering the above is a problem the other possibility I thought of
> is to change the order like abort prepared xacts after slot update.
> That is also dangerous because any failure while aborting could make a
> slot change permanent whereas the subscription option will still be
> old value. Now, because the slot's two_phase property is off, at
> commit, it can resend the entire transaction which can create a
> problem because the corresponding prepared transaction will already be
> present.

I feel it is a rare case but still possible, e.g., a race condition around
TwoPhaseStateLock, OOM, disk failures, and so on.
And since prepared transactions hold locks, the duplicated arrival of
transactions may cause table-lock failures.

> One more thing to think about in this regard is what if we fail after
> aborting a few prepared transactions and not all?

It's a bit hard to emulate, but I imagine some of the prepared transactions would remain.

> At this stage, I am not able to think of a good solution for these
> problems. So, if we don't get a solution for these, we can document
> that users can first manually abort prepared transactions and then
> switch off the two_phase option using Alter Subscription command.

I'm also not sure what we should do. Ideally, it would be nice to make
FinishPreparedTransaction() transactional, but I'm not sure that is realistic. So
the changes for aborting prepared txns were removed, and a documentation patch
was added instead.

Here is a summary of the updates. The dropping-prepared-transactions patch
was removed for now.

0001 - Codes for SUBOPT_TWOPHASE_COMMIT are moved per requirement [1].
   Also, checks for failover and two_phase are unified into one function.
0002 - updated accordingly. An argument for the check function is added.
0003 - this contains documentation changes required in [2].

[1]: 
https://www.postgresql.org/message-id/CAA4eK1%2BFRrL_fLWLsWQGHZRESg39ixzDX_S9hU8D7aFtU%2Ba8uQ%40mail.gmail.com
[2]: 
https://www.postgresql.org/message-id/CAA4eK1Khy_YWFoQ1HOF_tGtiixD8YoTg86coX1-ckxt8vK3U%3DQ%40mail.gmail.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



v17-0001-Allow-altering-of-two_phase-option-of-a-SUBSCRIP.patch
Description:  v17-0001-Allow-altering-of-two_phase-option-of-a-SUBSCRIP.patch


v17-0002-Alter-slot-option-two_phase-only-when-altering-t.patch
Description:  v17-0002-Alter-slot-option-two_phase-only-when-altering-t.patch


v17-0003-Notify-users-to-roll-back-prepared-transactions.patch
Description:  v17-0003-Notify-users-to-roll-back-prepared-transactions.patch


RE: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-07-09 Thread Hayato Kuroda (Fujitsu)
Dear Amit,

Thanks for the comments! Here I wanted to reply to one of them.

> What should be the behavior if one tries to set slot_name to NONE and
> also tries to toggle two_pahse option?

You mean a case like the one below, right?

```
ALTER SUBSCRIPTION sub SET (two_phase = false, slot_name = NONE);
```

For now, we accept such a command. The replication slot that was previously
specified is altered. As you know, this behavior is the same as for the
failover option.

> I feel both options together
> don't makes sense because there is no use in changing two_phase for
> some slot which we are disassociating the subscription from. The same
> could be said for the failover option as well, so if we agree with
> some different behavior here, we can follow the same for failover
> option as well.

While considering this further, I started to think the combination of slot_name
and two_phase should not be allowed. Even if both of them are altered at the
same time, the *old* slot will be modified by the backend process. I feel this
inconsistency should not happen. The check will be added in the next patch.
I also think the failover option should be fixed, but it is not touched here.
Let's keep the scope narrow.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



RE: Parallel heap vacuum

2024-07-05 Thread Hayato Kuroda (Fujitsu)
Dear Sawada-san,

> The parallel vacuum we have today supports only for index vacuuming.
> Therefore, while multiple workers can work on different indexes in
> parallel, the heap table is always processed by the single process.
> I'd like to propose $subject, which enables us to have multiple
> workers running on the single heap table. This would be helpful to
> speedup vacuuming for tables without indexes or tables with
> INDEX_CLENAUP = off.

Sounds great. IIUC, vacuuming is still one of the main weak points of postgres.

> I've attached a PoC patch for this feature. It implements only
> parallel heap scans in lazyvacum. We can extend this feature to
> support parallel heap vacuum as well in the future or in the same
> patch.

Before diving in deeply, I tested your PoC but found an unclear point.
When vacuuming is requested with parallel > 0 on almost the same workload
as yours, only the first page is scanned and cleaned up.

When parallel was set to zero, I got:
```
INFO:  vacuuming "postgres.public.test"
INFO:  finished vacuuming "postgres.public.test": index scans: 0
pages: 0 removed, 2654868 remain, 2654868 scanned (100.00% of total)
tuples: 12000 removed, 48000 remain, 0 are dead but not yet removable
removable cutoff: 752, which was 0 XIDs old when operation ended
new relfrozenxid: 739, which is 1 XIDs ahead of previous value
frozen: 0 pages from table (0.00% of total) had 0 tuples frozen
index scan not needed: 0 pages from table (0.00% of total) had 0 dead item 
identifiers removed
avg read rate: 344.639 MB/s, avg write rate: 344.650 MB/s
buffer usage: 2655045 hits, 2655527 misses, 2655606 dirtied
WAL usage: 1 records, 1 full page images, 937 bytes
system usage: CPU: user: 39.45 s, system: 20.74 s, elapsed: 60.19 s
```

This means that all pages were indeed scanned and the dead tuples were removed.
However, when parallel was set to one, I got a different result:

```
INFO:  vacuuming "postgres.public.test"
INFO:  launched 1 parallel vacuum worker for table scanning (planned: 1)
INFO:  finished vacuuming "postgres.public.test": index scans: 0
pages: 0 removed, 2654868 remain, 1 scanned (0.00% of total)
tuples: 12 removed, 0 remain, 0 are dead but not yet removable
removable cutoff: 752, which was 0 XIDs old when operation ended
frozen: 0 pages from table (0.00% of total) had 0 tuples frozen
index scan not needed: 0 pages from table (0.00% of total) had 0 dead item 
identifiers removed
avg read rate: 92.952 MB/s, avg write rate: 0.845 MB/s
buffer usage: 96 hits, 660 misses, 6 dirtied
WAL usage: 1 records, 1 full page images, 937 bytes
system usage: CPU: user: 0.05 s, system: 0.00 s, elapsed: 0.05 s
```

It looks like only one page was scanned and only 12 tuples were removed.
This seems very strange to me...

The attached script emulates my test. IIUC it is almost the same as yours,
except that the instance was restarted before vacuuming.

Can you reproduce this and see the reason? I can provide further information
on request.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



parallel_test.sh
Description: parallel_test.sh


RE: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-07-05 Thread Hayato Kuroda (Fujitsu)
Dear Amit,

Sorry, I forgot to mention one thing.

> > But that is not a good reason for this operation to stop workers
> > first. Instead, we should prohibit this operation if any worker is
> > present. The reason is that there is always a chance that if any
> > worker is alive, it can prepare a new transaction after we have
> > checked for the presence of any prepared transactions.
> 
> I used the function because it internally waits until all workers are exited.
> But OK, I modified like you suggested (logicalrep_workers_find() is used).

For that reason, after the above modification, the test code prior to v14
sometimes failed because the backend could execute ALTER SUBSCRIPTION ... SET
(two_phase) while workers were still alive. So I added lines to the test code
to poll until the workers have exited, e.g.:

```
+# Alter subscription two_phase to false
+$node_subscriber->safe_psql('postgres',
+	"ALTER SUBSCRIPTION tap_sub_copy DISABLE;");
+$node_subscriber->poll_query_until('postgres',
+	"SELECT count(*) = 0 FROM pg_stat_activity WHERE backend_type = 'logical replication worker'"
+);
+$node_subscriber->safe_psql(
+	'postgres', "
+    ALTER SUBSCRIPTION tap_sub_copy SET (two_phase = false);
+    ALTER SUBSCRIPTION tap_sub_copy ENABLE;");
```

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



RE: speed up a logical replica setup

2024-07-03 Thread Hayato Kuroda (Fujitsu)
Dear Amit,

> Your analysis looks correct to me. The test could fail due to
> autovacuum. See the following comment in
> 040_standby_failover_slots_sync.
> 
> # Disable autovacuum to avoid generating xid during stats update as otherwise
> # the new XID could then be replicated to standby at some random point making
> # slots at primary lag behind standby during slot sync.
> $publisher->append_conf('postgresql.conf', 'autovacuum = off');
>

Oh, I had not found that comment. I felt it should be added in
040_pg_createsubscriber.pl as well. Done.

> > # Descriptions for attached files
> >
> > An attached script can be used to reproduce the first failure without
> pg_createsubscriber.
> > It requires to modify the code like [1].
> 
> > 0003 patch disables autovacuum for node_p and node_s. I think node_p is
> enough, but did
> > like that just in case. This fixes a second failure.
> >
> 
> Disabling on the primary node should be sufficient. Let's do the
> minimum required to stabilize this test.

+1, removed.

PSA new version. 0001 has not been changed yet. A comment was added
in 0002 to clarify why we must wait. For 0003, a comment was added and
the setting for the standby was reverted.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



v3-0001-emit-dummy-message-while-setting-up-the-publisher.patch
Description:  v3-0001-emit-dummy-message-while-setting-up-the-publisher.patch


v3-0002-wait-until-RUNNING_XACT-is-replicated.patch
Description: v3-0002-wait-until-RUNNING_XACT-is-replicated.patch


v3-0003-disable-autovacuum-while-testing.patch
Description: v3-0003-disable-autovacuum-while-testing.patch


RE: speed up a logical replica setup

2024-07-02 Thread Hayato Kuroda (Fujitsu)
   This operation is done in
   CreateInitDecodingContext()->GetOldestSafeDecodingContext().
5. After that, the transaction from step 2 reaches the standby node and
   updates its nextXid.
6. Finally, pg_sync_replication_slots() runs on the standby. It finds a failover
   slot on the primary and tries to create it on the standby. However, the
   catalog_xmin on the primary (743) is older than the nextXid of the standby
   (744), so the slot creation is skipped.

To avoid the issue, we can disable autovacuum while testing.
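
A sketch of the change, assuming the node handle is $node_p as in the test file:

```perl
# Sketch: disable autovacuum on the primary so that no background XID
# consumption can advance the standby's nextXid during the test.
$node_p->append_conf('postgresql.conf', 'autovacuum = off');
$node_p->restart;
```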

# Descriptions for attached files

The attached script can be used to reproduce the first failure without
pg_createsubscriber. It requires modifying the code as in [1].

The 0001 patch reduces the wait time of the test; see [3]. I know the approach
is a bit hacky, but it worked.
The 0002 patch waits until the WAL record is replicated. This fixes the first
failure.
The 0003 patch disables autovacuum for node_p and node_s. I think node_p alone
is enough, but I did it that way just in case. This fixes the second failure.


[1]: 
https://www.postgresql.org/message-id/0dffca12-bf17-4a7a-334d-225569de5e6e%40gmail.com
[2]: 
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=adder&dt=2024-07-02%2008%3A45%3A39
[3]: 
https://www.postgresql.org/message-id/OSBPR01MB25521B15BF950D2523BBE143F5D32%40OSBPR01MB2552.jpnprd01.prod.outlook.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



v2-0001-emit-dummy-message-while-setting-up-the-publisher.patch
Description:  v2-0001-emit-dummy-message-while-setting-up-the-publisher.patch


v2-0002-wait-until-RUNNING_XACT-is-replicated.patch
Description: v2-0002-wait-until-RUNNING_XACT-is-replicated.patch


v2-0003-disable-autovacuum-while-testing.patch
Description: v2-0003-disable-autovacuum-while-testing.patch


repro_skipping_slot_sync.sh
Description: repro_skipping_slot_sync.sh


RE: speed up a logical replica setup

2024-07-02 Thread Hayato Kuroda (Fujitsu)
Dear Amit,

> IIUC, the problem is that the consistent_lsn value returned by
> setup_publisher() is the "end +1" location of the required LSN whereas
> the recovery_target_lsn used in wait_for_end_recovery() expects the
> LSN value to be "start" location of required LSN.

Yeah, right. That matches my understanding.

> This sounds like an ugly hack to me and don't know if we can use it.

I also think it is hacky, but I could not find a better solution.

> The ideal way to fix this is to get the start_lsn from the
> create_logical_slot functionality or have some parameter like
> recover_target_end_lsn but I don't know if this is a good time to
> extend such a functionality.

I felt that such an approach might work for HEAD, but it is not suitable for
PG17. An alternative approach I came up with was to insert a tuple while
waiting for the promotion. That would generate a WAL record so that the standby
can finish recovery after applying it. But I'm not sure how to do that, and it
seems to introduce an additional timing issue. Also, it does not improve the
behavior of the command - a normal user may still have to wait for some time.

Do you have any other ideas?

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



RE: speed up a logical replica setup

2024-07-01 Thread Hayato Kuroda (Fujitsu)
Dear Tom,

> I have a different but possibly-related complaint: why is
> 040_pg_createsubscriber.pl so miserably slow?  On my machine it
> runs for a bit over 19 seconds, which seems completely out of line
> (for comparison, 010_pg_basebackup.pl takes 6 seconds, and the
> other test scripts in this directory take much less).  It looks
> like most of the blame falls on this step:
> 
> [12:47:22.292](14.534s) ok 28 - run pg_createsubscriber on node S
> 
> AFAICS the amount of data being replicated is completely trivial,
> so that it doesn't make any sense for this to take so long --- and
> if it does, that suggests that this tool will be impossibly slow
> for production use.  But I suspect there is a logic flaw causing
> this.

I analyzed the issue. My elog() debugging showed that time was wasted in
wait_for_end_recovery(). This was caused by the recovery target not being reached.

We are setting recovery_target_lsn to the return value of
pg_create_logical_replication_slot(), which returns the end of the RUNNING_XACT
record. If we use the returned value as recovery_target_lsn as-is, however, we
must wait for additional WAL generation because the parameter requires the
replicated WAL to overtake that point. On my env, the function waited until the
bgwriter emitted the next XLOG_RUNNING_XACTS record.

One simple solution is to add an additional WAL record at the end of the
publisher setup. IIUC, an arbitrary WAL insertion can reduce the waiting time.
The attached patch inserts a small XLOG_LOGICAL_MESSAGE record, which greatly
reduced the execution time on my environment.

```
BEFORE
(13.751s) ok 30 - run pg_createsubscriber on node S
AFTER
(0.749s) ok 30 - run pg_createsubscriber on node S
```
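
For reference, the SQL-level equivalent of what the patch does is roughly the
following (pg_logical_emit_message() generates an XLOG_LOGICAL_MESSAGE record):

```sql
-- Sketch: emit one small, non-transactional WAL record so that the
-- recovery_target_lsn becomes reachable without waiting for the bgwriter.
SELECT pg_logical_emit_message(false, 'pg_createsubscriber', '');
```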

However, even after the modification, the reported failure [1] was not resolved
on my env.

What do you think?

[1]: 
https://www.postgresql.org/message-id/0dffca12-bf17-4a7a-334d-225569de5e6e%40gmail.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 


emit_dummy_message.diff
Description: emit_dummy_message.diff


RE: pg_createsubscriber: drop pre-existing subscriptions from the converted node

2024-06-30 Thread Hayato Kuroda (Fujitsu)
Dear Amit,

Thanks for the comments! PSA new version.

> Thanks, this is a better approach. I have changed a few comments and
> made some other cosmetic changes. See attached.

I checked your attached patch and it LGTM. Based on that, I added some changes
as below:

- Made dbname escaped while listing pre-existing subscriptions.
  The previous version could not pass the tests added by recent commits.
- Skipped dropping subscriptions in dry_run mode.
  I found the issue while porting the test to 040_pg_createsubscriber.pl.
- Added info-level output to follow the other drop_XXX functions.

> BTW, why have you created a separate test file for this test? I think
> we should add a new test to one of the existing tests in
> 040_pg_createsubscriber.

I had separated the test file just for confirmation purposes and planned to
merge it. The new patch set does that.

> You can create a dummy subscription on node_p
> and do a test similar to what we are doing in "# Create failover slot
> to test its removal".

Your approach looks better than mine, so I followed it.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



v4-0001-pg_createsubscriber-remove-unused-attribute.patch
Description: v4-0001-pg_createsubscriber-remove-unused-attribute.patch


v4-0002-pg_createsubscriber-Drop-pre-existing-subscriptio.patch
Description:  v4-0002-pg_createsubscriber-Drop-pre-existing-subscriptio.patch


RE: pg_createsubscriber: drop pre-existing subscriptions from the converted node

2024-06-26 Thread Hayato Kuroda (Fujitsu)
Dear Amit, Vingesh,

Thanks for the comments!

> It seems disabling subscriptions on the primary can make the primary
> stop functioning for some duration of time. I feel we need some
> solution where after converting to subscriber, we disable and drop
> pre-existing subscriptions. One idea could be that we use the list of
> new subscriptions created by the tool such that any subscription not
> existing in that list will be dropped.

Previously I avoided code like yours, because there is a chance that the
converted node could connect to another publisher. But per an off-list
discussion, we can prevent that by setting max_logical_replication_workers = 0.
I refactored with that approach.
Note that the GUC is checked at the verification phase, so an attribute was
added to start_standby_server() to select the workload.

Most of the comments by Vignesh were invalidated by the code change, but I
checked that the issues you raised were not reproduced. Also, 0001 was created
to remove an unused attribute.

> Shouldn't this be an open item for PG17?

Added this thread to the wiki page.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 


v2-0001-pg_createsubscriber-remove-unused-attribute.patch
Description: v2-0001-pg_createsubscriber-remove-unused-attribute.patch


v2-0002-pg_createsubscriber-Drop-pre-existing-subscriptio.patch
Description:  v2-0002-pg_createsubscriber-Drop-pre-existing-subscriptio.patch


RE: speed up a logical replica setup

2024-06-25 Thread Hayato Kuroda (Fujitsu)
Dear Amit, Euler,

Thanks for the comments! 0001 was modified per the suggestions.

> Yeah, it is a good idea to add a new option for two_phase but that
> should be done in the next version. For now, I suggest updating the
> docs and probably raising a warning (if max_prepared_transactions !=
> 0) as suggested by Noah. This WARNING is useful because one could
> expect that setting max_prepared_transactions != 0 means apply will
> happen at prepare time after the subscriber is created by this tool.
> The WARNING will be useful even if we support two_phase option as the
> user may have set the non-zero value of max_prepared_transactions but
> didn't use two_phase option.

I also think it should be tunable in PG18+. 0002 adds a description to the docs.
Also, the executable now warns if max_prepared_transactions != 0 on the
publisher. The subscriber side is not checked for now because I don't think it
is helpful, but it can easily be added. I did not add a test for that, because
the current test code does not check outputs.
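
The check is roughly along these lines (a sketch; names are illustrative, not
the exact patch code):

```c
/*
 * Sketch: fetch the publisher's setting and warn if prepared transactions
 * are possible there, since two_phase is disabled by this tool.
 */
res = PQexec(conn, "SHOW max_prepared_transactions");
if (strcmp(PQgetvalue(res, 0, 0), "0") != 0)
	pg_log_warning("two_phase option will be disabled");
PQclear(res);
```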

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



v2-0001-pg_createsubscriber-Fix-cases-which-connection-pa.patch
Description:  v2-0001-pg_createsubscriber-Fix-cases-which-connection-pa.patch


v2-0002-pg_createsubscriber-Warn-the-two-phase-is-disable.patch
Description:  v2-0002-pg_createsubscriber-Warn-the-two-phase-is-disable.patch


RE: Pgoutput not capturing the generated columns

2024-06-25 Thread Hayato Kuroda (Fujitsu)
Dear Shlok,

Thanks for updating the patches! Below are my comments, mostly for 0002.

01. General

IIUC, we have not discussed why ALTER SUBSCRIPTION ... SET
include_generated_columns is prohibited. Previously it seemed okay because there
were exclusive options, but now such restrictions are gone. Do you have a reason
in mind, or has it just not been considered yet?

02. General

According to the docs, we allow altering a column to a non-generated one via the
ALTER TABLE ... ALTER COLUMN ... DROP EXPRESSION command. I'm not sure what
should happen when that command is executed on the subscriber while the data is
being copied. Should we continue the copy or restart it? What do you think?
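
For reference, a minimal reproduction of the scenario I mean (the table name is
made up):

```sql
-- Hypothetical: the column stops being generated while the subscriber may
-- still be running the initial COPY.
CREATE TABLE tab (a int PRIMARY KEY,
                  b int GENERATED ALWAYS AS (a * 2) STORED);
ALTER TABLE tab ALTER COLUMN b DROP EXPRESSION;  -- b becomes a plain column
```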

03. Test code

IIUC, REFRESH PUBLICATION can also trigger table synchronization. Can you add
a test for that?

04. Test code (maybe for 0001)

Please test the combination with the ALTER TABLE ... ALTER COLUMN ... DROP
EXPRESSION command.

05. logicalrep_rel_open

```
+		/*
+		 * In case 'include_generated_columns' is 'false', we should skip the
+		 * check of missing attrs for generated columns.
+		 * In case 'include_generated_columns' is 'true', we should check if
+		 * corresponding column for the generated column in publication column
+		 * list is present in the subscription table.
+		 */
+		if (!MySubscription->includegencols && attr->attgenerated)
+		{
+			entry->attrmap->attnums[i] = -1;
+			continue;
+		}
```

This comment is not very clear to me, because here we do not skip anything.
Can you clarify why attnums[i] is set to -1 and how it will be used?

06. make_copy_attnamelist

```
+gencollist = palloc0(MaxTupleAttributeNumber * sizeof(bool));
```

I think this array is too large. Can we reduce the size to
(desc->natts * sizeof(bool))? Also, it should be freed afterward.
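
Something along these lines is what I have in mind (a sketch, not tested):

```c
/* Sketch: size the array by the actual attribute count, and free it later */
bool	   *gencollist = (bool *) palloc0(desc->natts * sizeof(bool));

/* ... fill and use gencollist ... */

pfree(gencollist);
```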

07. make_copy_attnamelist

```
+/* Loop to handle subscription table generated columns. */
+for (int i = 0; i < desc->natts; i++)
```

IIUC, the loop is needed to find generated columns on the subscriber side,
right? Can you clarify that in a comment?

08. copy_table

```
+/*
+ * Regular table with no row filter and 'include_generated_columns'
+ * specified as 'false' during creation of subscription.
+ */
```

I think this comment is not correct. After patching, every tablesync command
becomes COPY (SELECT ...) if include_generated_columns is set to true. Is that
right? Can we restrict this to only the tables that have generated columns?

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



RE: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-06-25 Thread Hayato Kuroda (Fujitsu)
> Dear hackers,
> 
> I found that v12 patch set could not be accepted by the cfbot. PSA new 
> version.

To make this more trackable for others, I'm sharing the changes just in case.
All failures occurred in the pg_dump code. I added an attribute to
pg_subscription and modified the pg_dump code, but it was wrong: the
constructed SQL was incomplete. I.e., in [1]:

```
pg_dump: error: query failed: ERROR:  syntax error at or near "."
LINE 15:  s.subforcealter
   ^
pg_dump: detail: Query was: SELECT s.tableoid, s.oid, s.subname,
 s.subowner,
 s.subconninfo, s.subslotname, s.subsynccommit,
 s.subpublications,
 s.subbinary,
 s.substream,
 s.subtwophasestate,
 s.subdisableonerr,
 s.subpasswordrequired,
 s.subrunasowner,
 s.suborigin,
 NULL AS suboriginremotelsn,
 false AS subenabled,
 s.subfailover
 s.subforcealter
FROM pg_subscription s
WHERE s.subdbid = (SELECT oid FROM pg_database
   WHERE datname = current_database())
```

Based on that, I just added a comma in the 0004 patch.

[1]: https://cirrus-ci.com/task/6710166165389312

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



RE: speed up a logical replica setup

2024-06-23 Thread Hayato Kuroda (Fujitsu)
Dear Noah,

> pg_createsubscriber fails on a dbname containing a space.  Use
> appendConnStrVal() here and for other params in get_sub_conninfo().  See the
> CVE-2016-5424 commits for more background.  For one way to test this
> scenario,
> see generate_db() in the pg_upgrade test suite.

Thanks for pointing that out. I made a fix patch. The test code was also
modified accordingly.
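
The core of the fix is a sketch like the following, assuming a PQExpBuffer
named "buf" is being built in get_sub_conninfo():

```c
/* Sketch: quote the value so spaces and quotes in dbname survive */
appendPQExpBufferStr(buf, " dbname=");
appendConnStrVal(buf, dbinfo->dbname);
```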

> > +static char *
> > +create_logical_replication_slot(PGconn *conn, struct LogicalRepInfo 
> > *dbinfo)
> > +{
> > +   PQExpBuffer str = createPQExpBuffer();
> > +   PGresult   *res = NULL;
> > +   const char *slot_name = dbinfo->replslotname;
> > +   char   *slot_name_esc;
> > +   char   *lsn = NULL;
> > +
> > +   Assert(conn != NULL);
> > +
> > +   pg_log_info("creating the replication slot \"%s\" on database \"%s\"",
> > +   slot_name, dbinfo->dbname);
> > +
> > +   slot_name_esc = PQescapeLiteral(conn, slot_name, strlen(slot_name));
> > +
> > +   appendPQExpBuffer(str,
> > + "SELECT lsn FROM
> pg_catalog.pg_create_logical_replication_slot(%s, 'pgoutput', false, false, 
> false)",
> 
> This is passing twophase=false, but the patch does not mention prepared
> transactions.  Is the intent to not support workloads containing prepared
> transactions?  If so, the documentation should say that, and the tool likely
> should warn on startup if max_prepared_transactions != 0.

IIUC, we decided that because it is the default behavior of logical replication.
See [1]. +1 for improving the documentation, but I'm not sure adding output is
helpful. I want to hear opinions from others.

> > +static void
> > +create_publication(PGconn *conn, struct LogicalRepInfo *dbinfo)
> > +{
> 
> > +   appendPQExpBuffer(str, "CREATE PUBLICATION %s FOR ALL TABLES",
> > + ipubname_esc);
> 
> This tool's documentation says it "guarantees that no transaction will be
> lost."  I tried to determine whether achieving that will require something
> like the fix from
> https://postgr.es/m/flat/de52b282-1166-1180-45a2-8d8917ca74c6@enterprise
> db.com.
> (Not exactly the fix from that thread, since that thread has not discussed the
> FOR ALL TABLES version of its race condition.)  I don't know.  On the one
> hand, pg_createsubscriber benefits from creating a logical slot after creating
> the publication.  That snapbuild.c process will wait for running XIDs.  On the
> other hand, an INSERT/UPDATE/DELETE acquires its RowExclusiveLock and
> builds
> its relcache entry before assigning an XID, so perhaps the snapbuild.c process
> isn't enough to prevent that thread's race condition.  What do you think?

IIUC, the documentation just intended to say that the type of replication will
be switched from streaming to logical at a certain point. Please give me some
time to analyze this.

[1]: 
https://www.postgresql.org/message-id/270ad9b8-9c46-40c3-b6c5-3d25b91d3a7d%40app.fastmail.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/

 


0001-pg_createsubscriber-Fix-cases-which-connection-param.patch
Description:  0001-pg_createsubscriber-Fix-cases-which-connection-param.patch


RE: Pgoutput not capturing the generated columns

2024-06-22 Thread Hayato Kuroda (Fujitsu)
Hi Shubham,

Thanks for sharing the new patch! You shared it as v9, but it should be v10,
right? Also, since there was no commitfest entry, I registered one [1]. You can
rename the title as needed. Currently the CFbot says OK.

Anyway, below are my comments.

01. General
Your patch contains unnecessary changes. Please remove all of them. E.g., 

```
 " s.subpublications,\n");
-
```
And
```
appendPQExpBufferStr(query, " o.remote_lsn AS 
suboriginremotelsn,\n"
-" s.subenabled,\n");
+   " s.subenabled,\n");
```

02. General
Again, please run the pgindent/pgperltidy.

03. test_decoding
Previously I suggested that the default value of include_generated_columns
should be true, and you modified it like that. However, Peter suggested the
opposite [3] and you just revised it accordingly. I think either way might be
okay, but at least you should clarify why you now prefer the default to be
false.

04. decoding_into_rel.sql
According to the comment atop this file, this test should insert the results
into a table. But the added case does not - we should put it somewhere else,
i.e., create another file.

05. decoding_into_rel.sql
```
+-- when 'include-generated-columns' is not set
```
Can you clarify the expected behavior in a comment?

06. getSubscriptions
```
+	else
+		appendPQExpBufferStr(query,
+							 " false AS subincludegencols,\n");
```
I think the trailing comma is not needed.
Also, this error means that you did not test pg_dump against instances prior to
PG16. Please verify that we can dump subscriptions and restore them accordingly.

[1]: https://commitfest.postgresql.org/48/5068/
[2]: 
https://www.postgresql.org/message-id/OSBPR01MB25529997E012DEABA8E15A02F5E52%40OSBPR01MB2552.jpnprd01.prod.outlook.com
[3]: 
https://www.postgresql.org/message-id/CAHut%2BPujrRQ63ju8P41tBkdjkQb4X9uEdLK_Wkauxum1MVUdfA%40mail.gmail.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



pg_createsubscriber: drop pre-existing subscriptions from the converted node

2024-06-21 Thread Hayato Kuroda (Fujitsu)
Dear Hackers,

This is a follow-up thread for pg_createsubscriber [1]. I started a new thread
since there has been no recent activity there.

## Problem

Assume there is a cascading replication setup like below:

node A --(logical replication)--> node B --(streaming replication)--> node C

In this case, subscriptions exist even on node C, but node C does not try to
connect to node A because the logical replication launcher/workers won't be
launched there. After the conversion, node C becomes a subscriber of node B, and
the subscription toward node A remains. Therefore, another worker that tries to
connect to node A will be launched, raising an ERROR [2]. This failure may occur
even during the conversion.

## Solution

The easiest solution is to drop the pre-existing subscriptions from the
converted node. To avoid establishing connections during the conversion,
slot_name is set to NONE on the primary first, and the subscriptions are then
dropped on the standby. The setting is restored on the primary node afterward,
roughly as sketched below.
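
At the SQL level, the idea corresponds roughly to the following (the
subscription name is just an example):

```sql
-- On the primary, before the conversion (the change reaches the standby
-- via physical replication):
ALTER SUBSCRIPTION pre_existing_sub DISABLE;
ALTER SUBSCRIPTION pre_existing_sub SET (slot_name = NONE);

-- On the converted node: now safe to drop, no remote slot is touched.
DROP SUBSCRIPTION pre_existing_sub;

-- On the primary, restore the original state:
ALTER SUBSCRIPTION pre_existing_sub SET (slot_name = 'pre_existing_sub');
ALTER SUBSCRIPTION pre_existing_sub ENABLE;
```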
The attached patch implements the idea. A test script is also included, but I'm
not sure it should go into HEAD.

BTW, I found that LogicalRepInfo.oid is not used anywhere. If needed, I can
create another patch to remove the attribute.

What do you think?

[1]: 
https://www.postgresql.org/message-id/CAA4eK1J22UEfrqx222h5j9DQ7nxGrTbAa_BC%2B%3DmQXdXs-RCsew%40mail.gmail.com
[2]: 
https://www.postgresql.org/message-id/CANhcyEWvimA1-f6hSrA%3D9qkfR5SonFb56b36M%2B%2BvT%3DLiFj%3D76g%40mail.gmail.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



0001-pg_createsubscriber-Drop-pre-existing-subscriptions-.patch
Description:  0001-pg_createsubscriber-Drop-pre-existing-subscriptions-.patch


RE: 001_rep_changes.pl fails due to publisher stuck on shutdown

2024-06-18 Thread Hayato Kuroda (Fujitsu)
Dear Horiguchi-san,

Thanks for sharing the patch! I agree this approach (ensuring WAL records are
flushed) is more appropriate than the others.

I have one unclear point. According to the comment atop GetInsertRecPtr(), it
just returns an approximated value - the position of the last full WAL page [1].
If there is a continuation WAL record that crosses a page boundary, will it
return the halfway point of that record (the end of the first WAL page)? If so,
the proposed fix seems insufficient; we have to point at the exact end of the
record.

[1]:
/*
 * GetInsertRecPtr -- Returns the current insert position.
 *
 * NOTE: The value *actually* returned is the position of the last full
 * xlog page. It lags behind the real insert position by at most 1 page.
 * For that, we don't need to scan through WAL insertion locks, and an
 * approximation is enough for the current usage of this function.
 */
XLogRecPtr
GetInsertRecPtr(void)

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 





RE: speed up a logical replica setup

2024-06-07 Thread Hayato Kuroda (Fujitsu)
Dear Euler,

Thanks for making the follow-up patch! I was looking forward to your updates.
I think this patch set is the solution for the found buildfarm error. However,
there are remained claims raised by others. You should reply what you think for
them. At least:

1) There are some misleading messages [1]. I think the v3-0005 patch can solve
   the issue.
2) pg_createsubscriber may fail if the primary has subscriptions [2]. IIUC the
   possible approaches are A) "keep subscriptions disabled at the end",
   B) "drop the pre-existing subscriptions by default",
   C) "do nothing, just document the risk".


> Before sending this email I realized that I did nothing about physical
> replication slots on the standby. I think we should also remove them too
> unconditionally.

I also considered this, but it might be difficult to predict what users expect.
Can we say for sure that such a slot is not intentional? Regarding failover
slots, it is OK because they are meaningful only on the standby, but I'm not
sure about other slots. I personally think we can keep the current spec, but
what do others think?


The parts below are comments on each patch.

0001
Basically LGTM. I was a bit confused because the default timeout is not set,
but it seems to follow the suggestion by Tomas [3].

0002
If you want to improve the commit message, please add that 
sync_replication_slots
is disabled during the conversion.

0003
Confirmed it followed the discussion [4].

0004
Basically LGTM.

Other minor comments are included in the attached diff file. It contains changes
to follow conventions and the results of pgindent/pgperltidy.

[1]: 
https://www.postgresql.org/message-id/CAA4eK1J2fAvsJ2HihbWJ_GxETd6sdqSMrZdCVJEutRZRpm1MEQ%40mail.gmail.com
[2]: 
https://www.postgresql.org/message-id/CANhcyEWvimA1-f6hSrA%3D9qkfR5SonFb56b36M%2B%2BvT%3DLiFj%3D76g%40mail.gmail.com
[3]: 
https://www.postgresql.org/message-id/5d5dd4cd-6359-4109-88e8-c8e13035ae16%40enterprisedb.com
[4]: 
https://www.postgresql.org/message-id/CAA4eK1LZxYxcbeiOn3Q5hjXVtZKhJWj-fQtndAeTCvZrPev8BA%40mail.gmail.com


Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/global/ 



minor_fix_by_kuroda.diff
Description: minor_fix_by_kuroda.diff


RE: Pgoutput not capturing the generated columns

2024-06-05 Thread Hayato Kuroda (Fujitsu)
Dear Shlok and Shubham,

Thanks for updating the patch!

I briefly checked v5-0002. IIUC, your patch copies generated columns
unconditionally. I think that behavior affects many people, so it will be hard
to get agreement on it.

Can we add a new option like `GENERATED_COLUMNS [boolean]`? If the default is
off, we can keep the current specification.

Thoughts?

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



RE: State of pg_createsubscriber

2024-05-22 Thread Hayato Kuroda (Fujitsu)
Dear Amit, Robert,

> So, we have the following options: (a) by default drop the
> pre-existing subscriptions, (b) by default disable the pre-existing
> subscriptions, and add a Note in the docs that users can take
> necessary actions to enable or drop them. Now, we can even think of
> providing a switch to retain the pre-existing subscriptions or
> publications as the user may have some use case where it can be
> helpful for her. For example, retaining publications can help in
> creating a bi-directional setup.

Another point we should consider is replication slots. If the standby server has
slots that were forgotten, WAL files won't be discarded, so a disk-full failure
will eventually happen. v2-0004 proposed in [1] drops replication slots whose
failover option is true. This can partially solve the issue, but what should be
done about the other slots?

[1]: 
https://www.postgresql.org/message-id/CANhcyEV6q1Vhd37i1axUeScLi0UAGVxta1LDa0BV0Eh--TcPMg%40mail.gmail.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



RE: Pgoutput not capturing the generated columns

2024-05-22 Thread Hayato Kuroda (Fujitsu)
Dear Shubham,

Thanks for updating the patch! I checked your patches briefly. Here are my 
comments.

01. API

Since the option for test_decoding is enabled by default, I think it should be
renamed, e.g., to "skip-generated-columns" or something similar.

02. ddl.sql

```
+-- check include-generated-columns option with generated column
+CREATE TABLE gencoltable (a int PRIMARY KEY, b int GENERATED ALWAYS AS (a * 2) STORED);
+INSERT INTO gencoltable (a) VALUES (1), (2), (3);
+SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1', 'include-generated-columns', '1');
+                             data
+--------------------------------------------------------------
+ BEGIN
+ table public.gencoltable: INSERT: a[integer]:1 b[integer]:2
+ table public.gencoltable: INSERT: a[integer]:2 b[integer]:4
+ table public.gencoltable: INSERT: a[integer]:3 b[integer]:6
+ COMMIT
+(5 rows)
```

We should also test the non-default case, in which generated columns are not output.
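
I.e., something like the following (a sketch based on the test above):

```sql
SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL,
    'include-xids', '0', 'skip-empty-xacts', '1',
    'include-generated-columns', '0');
```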

03. ddl.sql

I'm not sure the new tests are in the correct place. Do we have to add a new
file and move the tests there? Thoughts?

04. protocol.sgml

Please keep the format of the sgml file.

05. protocol.sgml

The option is implemented as a streaming option of the pgoutput plugin, so it
should be located under the "Logical Streaming Replication Parameters" section.

06. AlterSubscription

```
+			if (IsSet(opts.specified_opts, SUBOPT_GENERATED_COLUMN))
+			{
+				ereport(ERROR,
+						(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+						 errmsg("toggling generated_column option is not allowed.")));
+			}
```

If you don't want to support altering the option, you can remove the
SUBOPT_GENERATED_COLUMN macro from the function. But can you clarify why you
don't want to support it?

07. logicalrep_write_tuple

```
-		if (!column_in_column_list(att->attnum, columns))
+		if (!column_in_column_list(att->attnum, columns) && !att->attgenerated)
+			continue;
+
+		if (att->attgenerated && !publish_generated_column)
 			continue;
```

I think the changes from v2 were reverted or wrongly merged.

08. test code

Can you add tests verifying that generated columns are replicated by logical
replication?

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



RE: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-05-15 Thread Hayato Kuroda (Fujitsu)
Dear Peter,

> You wrote "Fixed" for that patch v9-0004 suggestion but I don't think
> anything was changed at all. Accidentally missed?

Sorry, I forgot to do `git add` after the revision.
The change is included in the new patch [1].

[1]: 
https://www.postgresql.org/message-id/OSBPR01MB25522052F9F3E3AAD3BA2A8CF5ED2%40OSBPR01MB2552.jpnprd01.prod.outlook.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



RE: Fix src/test/subscription/t/029_on_error.pl test when wal_debug is enabled

2024-05-15 Thread Hayato Kuroda (Fujitsu)
Dear Amit, Ian,

> I guess it could be more work if we want to enhance the test for
> ERRORs other than the primary key violation. One simple fix is to
> update the log_offset to the location of the LOG after successful
> replication of un-conflicted data. For example, the Log location after
> we execute the below line in the test:
> 
> # Check replicated data
>my $res =
>  $node_subscriber->safe_psql('postgres', "SELECT
> count(*) FROM tbl");
>is($res, $expected, $msg);

I made a patch for confirmation purposes. It worked well on my environment.
Ian, how about you?

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



fix_029.diff
Description: fix_029.diff


RE: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-05-14 Thread Hayato Kuroda (Fujitsu)
Dear Peter,

Thanks for reviewing! A new patch is available in [1].

> I'm having second thoughts about how these patches mention the option
> values "on|off". These are used in the ALTER SUBSCRIPTION document
> page for 'two_phase' and 'failover' parameters, and then those
> "on|off" get propagated to the code comments, error messages, and
> tests...
> 
> Now I see that on the CREATE SUBSCRIPTION page [1], every boolean
> parameter (even including 'two_phase' and 'failover') is described in
> terms of "true|false" (not "on|off").

Hmm. But I could find sentences like "The default value is off, ...". Also, the
"on|off" notation has already been used in alter_subscription.sgml. I'm not
sure, but I feel there are no strict rules here.

> In hindsight, it is probably better to refer only to true|false
> everywhere for these boolean parameters, instead of sometimes using
> different values like on|off.
> 
> What do you think?

It's OK for me to make the messages/code comments consistent. I'm not sure
about the documentation, but I changed only my part accordingly.

[1]: 
https://www.postgresql.org/message-id/OSBPR01MB2552F66463EFCFD654E87C09F5E32%40OSBPR01MB2552.jpnprd01.prod.outlook.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



RE: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-05-13 Thread Hayato Kuroda (Fujitsu)
/'force_alter' option is not specified as 
> true./

Fixed.

> 
> 7.
> +# Verify the prepared transaction still exists
> +$result = $node_subscriber->safe_psql('postgres',
> +"SELECT count(*) FROM pg_prepared_xacts;");
> +is($result, q(1), "prepared transaction still exits");
> +
> 
> TYPO: /exits/exists/

Fixed.

> 
> ~~~
> 
> 8.
> +# Alter the two_phase with the force_alter option. Apart from the above, the
> +# command will abort the prepared transaction and succeed.
> +$node_subscriber->safe_psql('postgres',
> +"ALTER SUBSCRIPTION regress_sub SET (two_phase = off, force_alter
> = true);");
> +$node_subscriber->safe_psql('postgres', "ALTER SUBSCRIPTION
> regress_sub ENABLE;");
> +
> 
> What does "Apart from the above" mean? Be more explicit.

Clarified like "Apart from the last ALTER SUBSCRIPTION command...".

> 9.
> +# Verify the prepared transaction are aborted
>  $result = $node_subscriber->safe_psql('postgres',
>  "SELECT count(*) FROM pg_prepared_xacts;");
>  is($result, q(0), "prepared transaction done by worker is aborted");
> 
> /transaction are aborted/transaction was aborted/

Fixed.

[1]: 
https://www.postgresql.org/message-id/OSBPR01MB2552FEA48D265EA278AA9F7AF5E22%40OSBPR01MB2552.jpnprd01.prod.outlook.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



RE: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-05-13 Thread Hayato Kuroda (Fujitsu)
Dear Peter,

Thanks for reviewing! The patch is available in [1].

> ==
> src/sgml/ref/alter_subscription.sgml
> 
> 1.
> + 
> +  The two_phase parameter can only be altered when
> the
> +  subscription is disabled. When altering the parameter from
> on
> +  to off, the backend process checks prepared
> +  transactions done by the logical replication worker and aborts them.
> + 
> 
> The text may be OK as-is, but I was wondering if it might be better to
> give a more verbose explanation.
> 
> BEFORE
> ... the backend process checks prepared transactions done by the
> logical replication worker and aborts them.
> 
> SUGGESTION
> ... the backend process checks for any incomplete prepared
> transactions done by the logical replication worker (from when
> two_phase parameter was still "on") and, if any are found, those are
> aborted.
>

Fixed.

> ==
> src/backend/commands/subscriptioncmds.c
> 
> 2. AlterSubscription
> 
> - /*
> - * Since the altering two_phase option of subscriptions
> - * also leads to the change of slot option, this command
> - * cannot be rolled back. So prevent we are in the
> - * transaction block.
> + * If two_phase was enabled, there is a possibility the
> + * transactions has already been PREPARE'd. They must be
> + * checked and rolled back.
>   */
> 
> BEFORE
> ... there is a possibility the transactions has already been PREPARE'd.
> 
> SUGGESTION
> ... there is a possibility that transactions have already been PREPARE'd.

Fixed.

> 3. AlterSubscription
> + /*
> + * Since the altering two_phase option of subscriptions
> + * (especially on->off case) also leads to the
> + * change of slot option, this command cannot be rolled
> + * back. So prevent we are in the transaction block.
> + */
>   PreventInTransactionBlock(isTopLevel,
> "ALTER SUBSCRIPTION ... SET (two_phase = off)");
> 
> 
> This comment is a bit vague and includes some typos, but IIUC these
> problems will already be addressed by the 0002 patch changes.AFAIK
> patch 0003 is only moving the 0002 comment.

Yeah, the comment was updated accordingly.

> 
> ==
> src/test/subscription/t/099_twophase_added.pl
> 
> 4.
> +# Check the case that prepared transactions exist on the subscriber node
> +#
> +# If the two_phase is altering from "on" to "off" and there are prepared
> +# transactions on the subscriber, they must be aborted. This test checks it.
> +
> 
> Similar to the comment that I gave for v8-0002. I think there should
> be  comment for the major test comment to
> distinguish it from comments for the sub-steps.

Added.

> 5.
> +# Verify the prepared transaction are aborted because two_phase is changed to
> +# "off".
> +$result = $node_subscriber->safe_psql('postgres',
> +"SELECT count(*) FROM pg_prepared_xacts;");
> +is($result, q(0), "prepared transaction done by worker is aborted");
> +
> 
> /the prepared transaction are aborted/any prepared transactions are aborted/

Fixed.

[1]: 
https://www.postgresql.org/message-id/OSBPR01MB2552FEA48D265EA278AA9F7AF5E22%40OSBPR01MB2552.jpnprd01.prod.outlook.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



RE: Improving the latch handling between logical replication launcher and worker processes.

2024-05-09 Thread Hayato Kuroda (Fujitsu)
Dear Vignesh,

Thanks for raising this idea!

> a) Introduce a new latch to handle worker attach and exit.

Just to confirm - there are three wait events for launchers, so I feel we may be
able to create a latch per wait event. Is there a reason to introduce only a
single latch?

> b) Add a new GUC launcher_retry_time which gives more flexibility to
> users as suggested by Amit at [1]. Before 5a3a953, the
> wal_retrieve_retry_interval plays a similar role as the suggested new
> GUC launcher_retry_time, e.g. even if a worker is launched, the
> launcher only wait wal_retrieve_retry_interval time before next round.

Hmm. My concern is how users would estimate the value. Maybe the default will be
3 min, but should users change it? If so, to what? I think even if it becomes
tunable, users cannot control it well.

> c) Don't reset the latch at worker attach and allow launcher main to
> identify and handle it. For this there is a patch v6-0002 available at
> [2].

Do you mean that you want to remove ResetLatch() from
WaitForReplicationWorkerAttach()? If so, what about the following scenario?

1) The launcher is waiting for the worker to attach in
WaitForReplicationWorkerAttach(), and 2) a subscription is created before the
attach completes. In this case, the launcher becomes unable to sleep because the
latch is set but never reset. It may waste CPU time.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 


RE: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-05-09 Thread Hayato Kuroda (Fujitsu)
Dear Peter,

> 
> ==
> Commit message
> 
> 1.
> A detailed commit message is needed to describe the purpose and
> details of this patch.

Added.

> ==
> doc/src/sgml/ref/alter_subscription.sgml
> 
> 2. CREATE SUBSCRIPTION
> 
> Shouldn't there be an entry for "force_alter" parameter in the CREATE
> SUBSCRIPTION "parameters" section, instead of just vaguely mentioning
> it in passing when describing the "two_phase" in ALTER SUBSCRIPTION?
>
> 3. ALTER SUBSCRIPTION - alterable parameters
> 
> And shouldn't this new option also be named in the ALTER SUBSCRIPTION
> list: "The parameters that can be altered are..."

Hmm, but the parameter cannot be used with CREATE SUBSCRIPTION. Should we modify
it to accept the option and add the description to the docs? Currently it is not
accepted there.

> ==
> src/backend/commands/subscriptioncmds.c
> 
> 4.
>   XLogRecPtr lsn;
> + bool twophase_force;
>  } SubOpts;
> 
> IMO this field ought to be called 'force_alter' to be the same as the
> option name. Sure, now it is only relevant for 'two_phase', but that
> might not always be the case in the future.

Modified.

> 5. AlterSubscription
> 
> + /*
> + * Abort prepared transactions if force option is also
> + * specified. Otherwise raise an ERROR.
> + */
> + if (!opts.twophase_force)
> + ereport(ERROR,
> + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> + errmsg("cannot alter %s when there are prepared transactions",
> + "two_phase = false")));
> +
> 
> 5a.
> /if force option is also specified/only if the 'force_alter' option is true/

Modified.

> 
> 5b.
> "two_phase = false" -- IMO that should say "two_phase = off"

Modified.

> 5c.
> IMO this ereport should include a errhint to tell the user they can
> use 'force_alter = true' to avoid getting this error.

A hint was added.
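
The message now looks roughly like this (the wording in the posted patch may
differ slightly):

```c
ereport(ERROR,
		(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
		 errmsg("cannot alter %s when there are prepared transactions",
				"two_phase = off"),
		 errhint("Resolve these transactions and try again, or set %s.",
				 "force_alter = true")));
```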

> 6.
> 
> + /* force_alter cannot be used standalone */
> + if (IsSet(opts.specified_opts, SUBOPT_FORCE_ALTER) &&
> + !IsSet(opts.specified_opts, SUBOPT_TWOPHASE_COMMIT))
> + ereport(ERROR,
> + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> + errmsg("%s must be specified with %s",
> + "force_alter", "two_phase")));
> +
> 
> IMO this rule is not necessary so the code should be removed. I think
> using 'force_alter' standalone doesn't do anything at all (certainly,
> it does no harm) so why add more complications (more rules, more code,
> more tests) just for the sake of it?

Removed. So a standalone 'force_alter' is now a no-op.

> src/test/subscription/t/099_twophase_added.pl
> 
> 7.
> +$node_subscriber->safe_psql('postgres',
> +"ALTER SUBSCRIPTION sub SET (two_phase = off, force_alter = on);");
> 
> "force" is a verb, so it is better to say 'force_alter = true' instead
> of 'force_alter = on'.

Fixed. Actually, I'm not sure it is better because I'm not a native speaker.

> 8.
>  $result = $node_subscriber->safe_psql('postgres',
>  "SELECT count(*) FROM pg_prepared_xacts;");
>  is($result, q(0), "prepared transaction done by worker is aborted");
> 
> +$node_subscriber->safe_psql('postgres', "ALTER SUBSCRIPTION sub
> ENABLE;");
> +
> 
> I felt the ENABLE statement should be above the SELECT statement so
> that the code is more like it was before applying the patch.

Fixed.

Please see the attached patch set.


Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



v8-0001-Allow-altering-of-two_phase-option-of-a-SUBSCRIPT.patch
Description:  v8-0001-Allow-altering-of-two_phase-option-of-a-SUBSCRIPT.patch


v8-0002-Alter-slot-option-two_phase-only-when-altering-on.patch
Description:  v8-0002-Alter-slot-option-two_phase-only-when-altering-on.patch


v8-0003-Abort-prepared-transactions-while-altering-two_ph.patch
Description:  v8-0003-Abort-prepared-transactions-while-altering-two_ph.patch


v8-0004-Add-force_alter-option-for-ALTER-SUBSCIRPTION-.-S.patch
Description:  v8-0004-Add-force_alter-option-for-ALTER-SUBSCIRPTION-.-S.patch


RE: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-05-09 Thread Hayato Kuroda (Fujitsu)
Dear Peter,

> Commit Message
> 
> 1.
> The patch needs a commit message to describe the purpose and highlight
> any limitations and other details.

Added.

> ==
> doc/src/sgml/ref/alter_subscription.sgml
> 
> 2.
> +
> + 
> +  The two_phase parameter can only be altered when
> the
> +  subscription is disabled. When altering the parameter from
> true
> +  to false, the backend process checks prepared
> +  transactions done by the logical replication worker and aborts them.
> + 
> 
> Here, the para is referring to "true" and "false" but earlier on this
> page it talks about "twophase = off". IMO it is better to use a
> consistent terminology like "on|off" everywhere instead of randomly
> changing the way it is described each time.

I checked the contents and changed them to "on|off".

> ==
> src/backend/commands/subscriptioncmds.c
> 
> 3. AlterSubscription
> 
>   if (IsSet(opts.specified_opts, SUBOPT_TWOPHASE_COMMIT))
>   {
> + List *prepared_xacts = NIL;
> 
> This 'prepared_xacts' can be declared at a lower scrope because it is
> only used if (!opts.twophase).
> 
> Furthermore, IIUC you don't need to assign NIL in the declaration
> because there is no chance for it to be unassigned anyway.

Made the scope narrower and removed the initialization.

> ~~~
> 
> 4. AlterSubscription
> 
> + /*
> + * The changed two_phase option (true->false) of the
> + * slot can't be rolled back.
> + */
>   PreventInTransactionBlock(isTopLevel,
> "ALTER SUBSCRIPTION ... SET (two_phase = off)");
> 
> Here is another example of inconsistent mixing of the terminology
> where the comment says "true"/"false" but the message says "off".
> Let's keep everything consistent. (I prefer on|off).

Modified.

> ~~~
> 
> 5.
> + if (sub->twophasestate == LOGICALREP_TWOPHASE_STATE_ENABLED &&
> + (prepared_xacts = GetGidListBySubid(subid)) != NIL)
> + {
> + ListCell *cell;
> +
> + /* Abort all listed transactions */
> + foreach(cell, prepared_xacts)
> + FinishPreparedTransaction((char *) lfirst(cell),
> +   false);
> +
> + list_free(prepared_xacts);
> + }
> 
> 5A.
> IIRC there is a cleaner way to write this loop without needing
> ListCell variable -- e.g. foreach_ptr() macro?

Changed.

> 5B.
> Shouldn't this be using list_free_deep() so the pstrdup gid gets freed too?

Yeah, fixed.

> ==
> src/test/subscription/t/099_twophase_added.pl
> 
> 6.
> +##
> +# Check the case that prepared transactions exist on subscriber node
> +##
> +
> 
> Give some more detailed comments here similar to the review comment of
> patch v7-0002 for the other part of this TAP test.
> 
> ~~~
> 
> 7. TAP test - comments
> 
> Same as for my v7-0002 review comments, I think this test case also
> needs a few more one-line comments to describe the sub-steps. e.g.:
> 
> # prepare a transaction to insert some rows to the table
> 
> # verify the prepared tx is replicated to the subscriber (because
> 'two_phase = on')
> 
> # toggle the two_phase to 'off' *before* the COMMIT PREPARED
> 
> # verify the prepared tx got aborted
> 
> # do the COMMIT PREPARED (note that now two_phase is 'off')
> 
> # verify the inserted rows got replicated ok

They were fixed based on your previous comments.

> 
> 8. TAP test - subscription name
> 
> It's better to rename the SUBSCRIPTION in this TAP test so you can
> avoid getting log warnings like:
> 
> psql::4: WARNING:  subscriptions created by regression test
> cases should have names starting with "regress_"
> psql::4: NOTICE:  created replication slot "sub" on publisher

Modified, but it was included in 0001.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 




RE: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-05-09 Thread Hayato Kuroda (Fujitsu)
> + appendStringInfo(&cmd, "ALTER_REPLICATION_SLOT %s ( ",
> + quote_identifier(slotname));
> +
> + if (failover)
> + appendStringInfo(&cmd, "FAILOVER %s ",
> + (*failover) ? "true" : "false");
> +
> + if (two_phase)
> + appendStringInfo(&cmd, "TWO_PHASE %s%s ",
> + (*two_phase) ? "true" : "false",
> + failover ? ", " : "");
> +
> + appendStringInfoString(&cmd, ");");
> 
> 4a.
> IIUC the comma logic here was broken in v7 when you swapped the order.
> Anyway, IMO it will be better NOT to try combining that comma logic
> with the existing appendStringInfo. Doing it separately is both easier
> and less error-prone.
> 
> Furthermore, the parentheses like "(*two_phase)" instead of just
> "*two_phase" seemed a bit overkill.
> 
> SUGGESTION:
> + if (failover)
> + appendStringInfo(&cmd, "FAILOVER %s",
> + *failover ? "true" : "false");
> +
> +   if (failover && two_phase)
> +   appendStringInfo(&cmd, ", ");
> +
> + if (two_phase)
> + appendStringInfo(&cmd, "TWO_PHASE %s",
> + *two_phase ? "true" : "false");
> +
> + appendStringInfoString(&cmd, " );");

Fixed.

> 4b.
> Like I said above, IMO the current separator logic in v7 is broken. So
> it is a bit concerning the tests all passed anyway. How did that
> happen? I think this indicates that there needs to be an additional
> test scenario where both 'failover' and 'two_phase' get altered at the
> same time so this code gets exercised properly.

Right, it was added.

> ==
> src/test/subscription/t/099_twophase_added.pl
> 
> 5.
> +# Define pre-existing tables on both nodes
> 
> Why say they are "pre-existing"? They are not pre-existing because you
> are creating them right here!

Removed the word.

> 6.
> +##
> +# Check the case that prepared transactions exist on publisher node
> +##
> 
> I think this needs a slightly more detailed comment.
> 
> SUGGESTION (this is just an example, but you can surely improve it)
> 
> # Check the case that prepared transactions exist on the publisher node.
> #
> # Since two_phase is "off", then normally this PREPARE will do nothing until
> # the COMMIT PREPARED, but in this test, we toggle the two_phase to "on" again
> # before the COMMIT PREPARED happens.

Changed with adjustments.

> 7.
> Maybe this test case needs a few more one-line comments for each of
> the sub-steps. e.g.:
> 
> # prepare a transaction to insert some rows to the table
> 
> # verify the prepared tx is not yet replicated to the subscriber
> (because 'two_phase = off')
> 
> # toggle the two_phase to 'on' *before* the COMMIT PREPARED
> 
> # verify the inserted rows got replicated ok

Modified like yours, with changes based on suggestions from Grammarly.

> 8.
> IIUC this test will behave the same even if you DON'T do the toggle
> 'two_phase = on'. So I wonder is there something more you can do to
> test this scenario more convincingly?

I found an indicator. When the apply worker starts, it outputs the current status
of the two_phase option. I added wait_for_log() to ensure the message below
appeared. Thoughts?

```
ereport(DEBUG1,
        (errmsg_internal("logical replication apply worker for subscription \"%s\" two_phase is %s",
                         MySubscription->name,
                         MySubscription->twophasestate == LOGICALREP_TWOPHASE_STATE_DISABLED ? "DISABLED" :
                         MySubscription->twophasestate == LOGICALREP_TWOPHASE_STATE_PENDING ? "PENDING" :
                         MySubscription->twophasestate == LOGICALREP_TWOPHASE_STATE_ENABLED ? "ENABLED" :
                         "?")));
```
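
On the test side, the check would look roughly like this (a sketch; it assumes
log_min_messages is at least debug1 on the subscriber, because the message is
emitted at DEBUG1):

```
# Remember the current end of the subscriber's log file
my $offset = -s $node_subscriber->logfile;

# ... enable the subscription, then wait for the status message
$node_subscriber->wait_for_log(
	qr/logical replication apply worker for subscription "tap_sub_copy" two_phase is ENABLED/,
	$offset);
```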
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



RE: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-05-09 Thread Hayato Kuroda (Fujitsu)
Dear Peter,

Thanks for reviewing! The patch will be posted in an upcoming message.

> ==
> src/backend/access/transam/twophase.c
> 
> 1. IsTwoPhaseTransactionGidForSubid
> 
> +/*
> + * IsTwoPhaseTransactionGidForSubid
> + * Check whether the given GID is formed by TwoPhaseTransactionGid.
> + */
> +static bool
> +IsTwoPhaseTransactionGidForSubid(Oid subid, char *gid)
> 
> I think the function comment should mention something about 'subid'.
> 
> SUGGESTION
> Check whether the given GID (as formed by TwoPhaseTransactionGid) is
> for the specified 'subid'.

Fixed.

> src/backend/commands/subscriptioncmds.c
> 
> 2. AlterSubscription
> 
> + if (!opts.twophase &&
> + form->subtwophasestate == LOGICALREP_TWOPHASE_STATE_ENABLED
> &&
> + LookupGXactBySubid(subid))
> + /* Add error message */
> + ereport(ERROR,
> + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> + errmsg("cannot disable two_phase when uncommitted prepared
> transactions present"),
> + errhint("Resolve these transactions and try again")));
> 
> The comment "/* Add error message */" seems unnecessary.

Yeah, that was an internal note to myself. Removed.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



RE: Pgoutput not capturing the generated columns

2024-05-08 Thread Hayato Kuroda (Fujitsu)
Dear Shubham,

Thanks for creating a patch! Here are high-level comments.

1.
Please document the feature. If it is hard to describe, we should change the 
API.

2.
Currently, the option is implemented as a streaming option. Are there any reasons
to choose this way? Another approach is to implement it as a slot option, like
failover and temporary.

3.
You said that the subscription option is not supported for now. Does it mean that
the logical replication feature cannot be used for generated columns? If so, the
restriction won't be acceptable. If the combination of this and the initial sync
is problematic, can't we exclude them in CreateSubscription and
AlterSubscription?
E.g., the create_slot option cannot be set if slot_name is NONE.
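
For reference, CreateSubscription() already rejects inconsistent combinations up
front, so a similar check could be added for this option:

```
CREATE SUBSCRIPTION regress_sub CONNECTION 'dbname=postgres' PUBLICATION pub
    WITH (slot_name = NONE, create_slot = true);
ERROR:  slot_name = NONE and create_slot = true are mutually exclusive options
```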

4.
Regarding the test_decoding plugin, it has already been able to decode generated
columns. So, in the first place, is the proposed option really needed for the
plugin? Why do you include it?
If you want to add the option anyway, the default value should be on, which keeps
the current behavior.

5.
Assuming that the feature becomes usable for logical replication, should we
change the protocol version at that time? Nodes prior to PG17 may not want to
receive values for generated columns. Can we control this only by the option?

6. logicalrep_write_tuple()

```
-if (!column_in_column_list(att->attnum, columns))
+if (!column_in_column_list(att->attnum, columns) && !att->attgenerated)
+continue;
```

Hmm, does the above mean that generated columns are decoded even if they are not
in the column list? If so, why? I think such columns should not be sent.

7.

Some functions refer to data->publish_generated_column many times. Can we store
the value in a variable?

The comments below are for the test_decoding part, but they may not be needed.

=

a. pg_decode_startup()

```
+else if (strcmp(elem->defname, "include_generated_columns") == 0)
```

Other options for test_decoding do not use underscores. It should be
"include-generated-columns".

b. pg_decode_change()

data->include_generated_columns is referred to four times in the function.
Can you store the value in a variable?


c. pg_decode_change()

```
-true);
+true, data->include_generated_columns );
```

Please remove the extra space.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



RE: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-05-08 Thread Hayato Kuroda (Fujitsu)
> There are double blank lines after the first if-block.

Removed.

> ==
> src/test/regress/sql/subscription.sql
> 
> 16.
> I know you do this already in the TAP test, but doesn't the test case
> to demonstrate that 'two-phase' option can be altered when the
> subscription is disabled actually belong here in the regression
> instead?

Actually, it cannot be done in the main regression test, because altering
two_phase requires a connection between pub/sub, which is not established in the
subscription.sql file. The success case for altering failover has not been tested
either, and I think for the same reason.

> src/test/subscription/t/021_twophase.pl
> 
> 17.
> +# Disable the subscription and alter it to two_phase = false,
> +# verify that the altered subscription reflects the two_phase option.
> 
> /verify/then verify/

Fixed.

> 18.
> +# Now do a prepare on publisher and make sure that it is not replicated.
> +$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub");
> +$node_publisher->safe_psql(
> +   'postgres', qq{
> +BEGIN;
> +INSERT INTO tab_copy VALUES (100);
> +PREPARE TRANSACTION 'newgid';
> + });
> +
> 
> 18a.
> /on publisher/on the publisher/

Fixed.

> 18b.
> What is that "DROP SUBSCRIPTION tap_sub" doing here? It seems
> misplaced under this comment.

The subscription must be dropped because it also prepares a transaction.
I moved it before the test case and added comments.

> 19.
> +# Make sure that there is 0 prepared transaction on the subscriber
> +$result = $node_subscriber->safe_psql('postgres',
> + "SELECT count(*) FROM pg_prepared_xacts;");
> +is($result, qq(0), 'transaction is prepared on subscriber');
> 
> 19a.
> SUGGESTION
> Make sure there are no prepared transactions on the subscriber

Fixed.

> 19b.
> /'transaction is prepared on subscriber'/'should be no prepared
> transactions on subscriber'/

Replaced.

> 20.
> +# Made sure that the commited transaction is replicated.
> 
> /Made sure/Make sure/
> 
> /commited/committed/

Fixed.

> 21.
> +# Make sure that the two-phase is enabled on the subscriber
> +$result = $node_subscriber->safe_psql('postgres',
> + "SELECT subtwophasestate FROM pg_subscription WHERE subname =
> 'tap_sub_copy';"
> +);
> +is($result, qq(e), 'two-phase is disabled');
> 
> The 'two-phase is disabled' is the identical message used in the
> opposite case earlier, so something is amiss. Maybe this one should
> say 'two-phase should be enabled' and the earlier counterpart should
> say 'two-phase should be disabled'.

Both of them were fixed.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 




v7-0001-Allow-altering-of-two_phase-option-of-a-SUBSCRIPT.patch
Description:  v7-0001-Allow-altering-of-two_phase-option-of-a-SUBSCRIPT.patch


v7-0002-Alter-slot-option-two_phase-only-when-altering-tr.patch
Description:  v7-0002-Alter-slot-option-two_phase-only-when-altering-tr.patch


v7-0003-Abort-prepared-transactions-while-altering-two_ph.patch
Description:  v7-0003-Abort-prepared-transactions-while-altering-two_ph.patch


v7-0004-Add-force_alter-option.patch
Description: v7-0004-Add-force_alter-option.patch


RE: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-04-23 Thread Hayato Kuroda (Fujitsu)
Dear hackers,

Per recent commit (b29cbd3da), our patch needed to be rebased.
Here is an updated version.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 
 


v6-0001-Allow-altering-of-two_phase-option-of-a-SUBSCRIPT.patch
Description:  v6-0001-Allow-altering-of-two_phase-option-of-a-SUBSCRIPT.patch


v6-0002-Alter-slot-option-two_phase-only-when-altering-tr.patch
Description:  v6-0002-Alter-slot-option-two_phase-only-when-altering-tr.patch


v6-0003-Abort-prepared-transactions-while-altering-two_ph.patch
Description:  v6-0003-Abort-prepared-transactions-while-altering-two_ph.patch


v6-0004-Add-force_alter-option.patch
Description: v6-0004-Add-force_alter-option.patch


RE: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-04-22 Thread Hayato Kuroda (Fujitsu)
> Dear Vitaly,
> 
> > I looked at the patch and realized that I can't try it easily in the near 
> > future
> > because the solution I'm working on is based on PG16 or earlier. This patch 
> > is
> > not easily applicable to the older releases. I have to port my solution to 
> > the
> > master, which is not done yet.
> 
> We also tried to port our patch for PG16, but the largest barrier was that a
> replication command ALTER_SLOT is not supported. Since the slot option
> two_phase
> can't be modified, it is difficult to skip decoding PREPARE command even when
> altering the option from true to false.
> IIUC, Adding a new feature (e.g., replication command) for minor updates is
> generally
> prohibited
> 
> We must consider another approach for backpatching, but we do not have
> solutions
> for now.

The attached patch set is a ported version for PG16, which breaks ABI. It can be
used for testing purposes, but it won't be pushed to REL_16_STABLE.
At least, this patch set passes my GitHub CI.

Can you apply it and check whether your issue is solved?

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 
 


REL_16_0001-Allow-altering-of-two_phase-option-of-a-SUBSCRIPTION.patch
Description:  REL_16_0001-Allow-altering-of-two_phase-option-of-a-SUBSCRIPTION.patch


REL_16_0002-Alter-slot-option-two_phase-only-when-altering-true-.patch
Description:  REL_16_0002-Alter-slot-option-two_phase-only-when-altering-true-.patch


REL_16_0003-Abort-prepared-transactions-while-altering-two_phase.patch
Description:  REL_16_0003-Abort-prepared-transactions-while-altering-two_phase.patch


REL_16_0004-Add-force_alter-option.patch
Description: REL_16_0004-Add-force_alter-option.patch


RE: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-04-22 Thread Hayato Kuroda (Fujitsu)
Dear hackers,

> Looking at this a bit more, maybe rolling back all prepared transactions on 
> the
> subscriber when toggling two_phase from true to false might not be desirable
> for the customer. Maybe we should have an option for customers to control
> whether transactions should be rolled back or not. Maybe transactions should
> only be rolled back if a "force" option is also set. What do people think?

And here is a patch that adds a new option "force_alter" (a better name is very
welcome). It can be used only when altering the two_phase option. Let me share
examples.

Assume there is a logical replication system with two_phase = on, and there are
prepared transactions:

```
subscriber=# SELECT * FROM pg_prepared_xacts ;
 transaction |       gid        |           prepared            |  owner   | database 
-------------+------------------+-------------------------------+----------+----------
         741 | pg_gid_16390_741 | 2024-04-22 08:02:34.727913+00 | postgres | postgres
         742 | pg_gid_16390_742 | 2024-04-22 08:02:34.729486+00 | postgres | postgres
(2 rows)
```

At that time, altering two_phase to false alone will fail:

```
subscriber=# ALTER SUBSCRIPTION sub DISABLE ;
ALTER SUBSCRIPTION
subscriber=# ALTER SUBSCRIPTION sub SET (two_phase = off);
ERROR:  cannot alter two_phase = false when there are prepared transactions
```

It succeeds if force_alter is also explicitly set. Prepared transactions are
aborted at that time.

```
subscriber=# ALTER SUBSCRIPTION sub SET (two_phase = off, force_alter = on);
ALTER SUBSCRIPTION
subscriber=# SELECT * FROM pg_prepared_xacts ;
 transaction | gid | prepared | owner | database 
-------------+-----+----------+-------+----------
(0 rows)
```

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/global/ 



v5-0001-Allow-altering-of-two_phase-option-of-a-SUBSCRIPT.patch
Description:  v5-0001-Allow-altering-of-two_phase-option-of-a-SUBSCRIPT.patch


v5-0002-Alter-slot-option-two_phase-only-when-altering-tr.patch
Description:  v5-0002-Alter-slot-option-two_phase-only-when-altering-tr.patch


v5-0003-Abort-prepared-transactions-while-altering-two_ph.patch
Description:  v5-0003-Abort-prepared-transactions-while-altering-two_ph.patch


v5-0004-Add-force_alter-option.patch
Description: v5-0004-Add-force_alter-option.patch


RE: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-04-22 Thread Hayato Kuroda (Fujitsu)
Dear Vitaly,

> I looked at the patch and realized that I can't try it easily in the near 
> future
> because the solution I'm working on is based on PG16 or earlier. This patch is
> not easily applicable to the older releases. I have to port my solution to the
> master, which is not done yet.

We also tried to port our patch for PG16, but the largest barrier was that a
replication command ALTER_SLOT is not supported. Since the slot option two_phase
can't be modified, it is difficult to skip decoding PREPARE command even when
altering the option from true to false.
IIUC, adding a new feature (e.g., a replication command) in minor updates is
generally prohibited.

We must consider another approach for backpatching, but we do not have solutions
for now.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/global/ 

 


RE: Disallow changing slot's failover option in transaction block

2024-04-18 Thread Hayato Kuroda (Fujitsu)
Dear Hou,

Thanks for updating the patch! Let me confirm the content.

In your patch, pg_dump.c was updated. IIUC the main reason is that pg_restore
executes some queries as single transactions, so ALTER SUBSCRIPTION cannot be
done, right?
Also, failover was synchronized only when we were in upgrade mode, but your
patch seems to remove that condition. Can you clarify the reason?

Other than that, the patch LGTM.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



RE: Disallow changing slot's failover option in transaction block

2024-04-18 Thread Hayato Kuroda (Fujitsu)
Dear Shveta,

> When I said that option 2 aligns with ALTER-SUB documented behaviour,
> I meant the doc described in [1] stating "When altering the slot_name,
> the failover and two_phase property values of the named slot may
> differ from the counterpart failover and two_phase parameters
> specified in the subscription"
> 
> [1]:
> https://www.postgresql.org/docs/devel/sql-altersubscription.html#SQL-ALTER
> SUBSCRIPTION-PARAMS-SET

I see, thanks for the clarification. Agreed that the description does not
conflict with option 2.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



RE: Disallow changing slot's failover option in transaction block

2024-04-17 Thread Hayato Kuroda (Fujitsu)
Dear Shveta,

Sorry for the delayed response. I missed your post.

> +1.
> 
> Similar to ALTER SUB, CREATE SUB also needs to be fixed. This is
> because we call alter_replication_slot in CREATE SUB as well, for the
> case when slot_name is provided and create_slot=false. But the tricky
> part is we call alter_replication_slot() when creating subscription
> for both failover=true and false. That means if we want to fix it on
> the similar line of ALTER SUB, we have to disallow user from executing
> the CREATE SUBSCRIPTION (slot_name = xx) in a txn block, which seems
> to break some existing use cases. (previously, user can execute such a
> command inside a txn block).
> 
> So, we need to think if there are better ways to fix it.  After
> discussion with Hou-San offlist, here are some ideas:
> 1. do not alter replication slot's failover option when CREATE
> SUBSCRIPTION   WITH failover=false. This means we alter replication
> slot only when failover is set to true. And thus we can disallow
> CREATE SUB WITH (slot_name =xx, failover=true, create_slot=false)
> inside a txn block.
> 
> This option allows user to run CREATE-SUB(create_slot=false) with
> failover=false in txn block like earlier. But on the downside, it
> makes the behavior inconsistent for otherwise simpler option like
> failover,  i.e. with failover=true, CREATE SUB is not allowed in txn
> block while with failover=false, it is allowed. It makes it slightly
> complex to be understood by user.
> 
> 2. let's not disallow CREATE SUB in txn block as earlier, just don't
> perform internal alter-failover during CREATE SUB for existing
> slots(create_slot=false, slot_name=xx)  i.e. when create_slot is
> false, we will ignore failover parameter of CREATE SUB and it is
> user's responsibility to set it appropriately using ALTER SUB cmd. For
> create_slot=true, the behaviour of CREATE-SUB remains same as earlier.
> 
> This option does not add new restriction for CREATE SUB wrt txn block.
> In context of failover with create_slot=false, we already have a
> similar restriction (documented one) for ALTER SUB, i.e. with 'ALTER
> SUBSCRIPTION SET(slot_name = new)', user needs to alter the new slot's
> failover by himself. CREAT SUB can also be documented in similar way.
> This seems simpler to be understood considering existing ALTER SUB's
> behavior as well. Plus, this will make CREATE-SUB code slightly
> simpler and thus easily manageable.
> 
> 3. add a alter_slot option for CREATE SUBSCRIPTION, we can only alter
> the  slot's failover if alter_slot=true. And so we can disallow CREATE
> SUB WITH (slot_name =xx, alter_slot=true) inside a txn block.
> 
> This seems a clean way, as everything will be as per user's consent
> based on alter_slot parameter. But on the downside, this will need
> introducing additional parameter and also adding new restriction of
> running CREATE-sub in txn  block for a specific case.
> 
> 4. Don't alter replication in subscription DDLs. Instead, try to alter
> replication slot's failover in the apply worker. This means we need to
> execute alter_replication_slot each time before starting streaming in
> the apply worker.
> 
> This does not seem appealing to execute alter_replication_slot
> everytime the apply worker starts. But if others think it as a better
> option, it can be further analyzed.

Thanks for describing them. I also prefer 2, because it seems a bit strange that
a CREATE statement leads to an ALTER.

> Currently, our preference is option 2, as that looks a clean solution
> and also aligns with ALTER-SUB behavior which is already documented.
> Thoughts?
> 
> 
> Note that we could not refer to the design of two_phase here, because
> two_phase can be considered as a streaming option, so it's fine to
> change the two_phase along with START_REPLICATION command. (the
> two_phase is not changed in subscription DDLs, but get changed in
> START_REPLICATION command). But the failover is closely related to a
> replication slot itself.
> 

Sorry, I cannot find the statements. Which part were you referring to?

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



RE: Disallow changing slot's failover option in transaction block

2024-04-16 Thread Hayato Kuroda (Fujitsu)
Dear Hou,

> Kuroda-San reported an issue off-list that:
> 
> If user execute ALTER SUBSCRIPTION SET (failover) command inside a txn block
> and rollback, only the subscription option change can be rolled back, while 
> the
> replication slot's failover change is preserved.
> 
> This is because ALTER SUBSCRIPTION SET (failover) command internally
> executes
> the replication command ALTER_REPLICATION_SLOT to change the replication
> slot's
> failover property, but this replication command execution cannot be
> rollback.
> 
> To fix it, I think we can prevent user from executing ALTER SUBSCRIPTION set
> (failover) inside a txn block, which is also consistent to the ALTER
> SUBSCRIPTION REFRESH/DROP SUBSCRIPTION command. Attach a small
> patch to address this.

Thanks for posting the patch; the fix matches my expectation.
Also, should we add the restriction to the doc? I feel [1] can be updated.


[1]:https://www.postgresql.org/docs/devel/sql-altersubscription.html#SQL-ALTERSUBSCRIPTION-PARAMS-SET

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 





RE: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-04-15 Thread Hayato Kuroda (Fujitsu)
Dear Amit,

> > FYI - We also considered the idea which walsender waits until all prepared
> transactions
> > are resolved before decoding and sending changes, but it did not work well
> > - the restarted walsender sent only COMMIT PREPARED record for
> transactions which
> > have been prepared before disabling the subscription. This happened because
> > 1) if the two_phase option of slots is false, the confirmed_flush can be 
> > ahead of
> >PREPARE record, and
> > 2) after the altering and restarting, start_decoding_at becomes same as
> >confirmed_flush and records behind this won't be decoded.
> >
> 
> I don't understand the exact problem you are facing. IIUC, if the
> commit is after start_decoding_at point and prepare was before it, we
> expect to send the entire transaction followed by a commit record. The
> restart_lsn should be before the start of such a transaction and we
> should have recorded the changes in the reorder buffer.

This behavior is right for the two_phase = false case. But if the parameter is
altered between PREPARE and COMMIT PREPARED, there is a possibility that only
COMMIT PREPARED is sent. In the first place, the executed workload is below.

1. created a subscription with (two_phase = false)
2. prepared a transaction on publisher
3. disabled the subscription once
4. altered the subscription to two_phase = true
5. enabled the subscription again
6. did COMMIT PREPARED on the publisher

-> The apply worker would raise an ERROR while applying the COMMIT PREPARED record:
ERROR:  prepared transaction with identifier "pg_gid_XXX_YYY" does not exist
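
In SQL terms, the workload is roughly the following (a sketch; object names are
made up, and step 4 assumes the patched ALTER SUBSCRIPTION):

```
-- 1. subscriber
CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres port=5431'
    PUBLICATION pub WITH (two_phase = false);

-- 2. publisher
BEGIN;
INSERT INTO tbl VALUES (1);
PREPARE TRANSACTION 'gid1';

-- 3.-5. subscriber
ALTER SUBSCRIPTION sub DISABLE;
ALTER SUBSCRIPTION sub SET (two_phase = on);
ALTER SUBSCRIPTION sub ENABLE;

-- 6. publisher
COMMIT PREPARED 'gid1';    -- the apply worker then fails with the ERROR above
```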

The part below describes why the ERROR occurs.

==

### Regarding 1): the confirmed_flush can be ahead of the PREPARE record

If two_phase is off, as you might know, confirmed_flush can be ahead of the
PREPARE record due to the keepalive mechanism.

The walsender sometimes sends a keepalive message in WalSndKeepalive(). Here the
LSN of the lastly decoded record is written. Since the PREPARE record is skipped
(just handled by ReorderBufferProcessXid()), the LSN written in the message can
sometimes be ahead of the PREPARE record. If the WAL records are aligned like
below, the LSN can point to CHECKPOINT_ONLINE.

...
INSERT
PREPARE txn1
CHECKPOINT_ONLINE
...

On the worker side, when it receives the keepalive, it compares the LSN in the
message with the lastly received LSN and advances last_received. Then, the worker
replies to the walsender, reporting that the last_received record has been
flushed on the subscriber. See send_feedback().
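
That part of the apply loop is shaped roughly like this (trimmed from
LogicalRepApplyLoop(); error handling omitted):

```
else if (c == 'k')		/* keepalive from the walsender */
{
	XLogRecPtr	end_lsn = pq_getmsgint64(&s);	/* can be past the skipped
												 * PREPARE record */
	TimestampTz timestamp = pq_getmsgint64(&s);	/* unused in this sketch */
	bool		reply_requested = pq_getmsgbyte(&s);

	if (last_received < end_lsn)
		last_received = end_lsn;

	/* reports last_received back as written and flushed */
	send_feedback(last_received, reply_requested, false);
}
```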
 
On the publisher, when the walsender receives the message from the subscriber,
it reads the message and advances confirmed_flush to the written value. If the
walsender sends an LSN located ahead of PREPARE, confirmed_flush is updated
accordingly.

### Regarding 2): after the altering, records behind confirmed_flush are not decoded

Then, at the decoding phase, the snapshot builder determines the point where
decoding is resumed, as start_decoding_at. After the restart, the value is the
same as the slot's confirmed_flush. Since confirmed_flush is ahead of the PREPARE
record, start_decoding_at is ahead as well, so the whole prepared transaction is
not decoded.

==

The attached zip file contains the PoC and the script used. You can refer to what
I actually did.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



RE: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-04-15 Thread Hayato Kuroda (Fujitsu)
Dear Amit,

> Vitaly, does the minimal solution provided by the proposed patch
> (Allow to alter two_phase option of a subscriber provided no
> uncommitted
> prepared transactions are pending on that subscription.) address your use 
> case?

I think we do not have to handle cases where there are prepared transactions on
the publisher/subscriber, as the first step. That leads to additional complexity,
and we do not have smarter solutions, especially for problem 2.
IIUC it meets Vitaly's condition, right?

> > 1. While toggling two_phase from true to false, we could probably get a 
> > list of
> prepared transactions for this subscriber id and rollback/abort the prepared
> transactions. This will allow the transactions to be re-applied like a normal
> transaction when the commit comes. Alternatively, if this isn't appropriate 
> doing it
> in the ALTER SUBSCRIPTION context, we could store the xids of all prepared
> transactions of this subscription in a list and when the corresponding xid is 
> being
> committed by the apply worker, prior to commit, we make sure the previously
> prepared transaction is rolled back. But this would add the overhead of 
> checking
> this list every time a transaction is committed by the apply worker.
> >
> 
> In the second solution, if you check at the time of commit whether
> there exists a prior prepared transaction then won't we end up
> applying the changes twice? I think we can first try to achieve it at
> the time of Alter Subscription because the other solution can add
> overhead at each commit?

Yeah, at least the second solution might be problematic. I prototyped the first
one and it worked well. However, to make the feature more consistent, having
prepared transactions on the subscriber is prohibited for now.
We can ease the restriction based on requirements.

> > 2. No solution yet.
> >
> 
> One naive idea is that on the publisher we can remember whether the
> prepare has been sent and if so then only send commit_prepared,
> otherwise send the entire transaction. On the subscriber-side, we
> somehow, need to ensure before applying the first change whether the
> corresponding transaction is already prepared and if so then skip the
> changes and just perform the commit prepared. One drawback of this
> approach is that after restart, the prepare flag wouldn't be saved in
> the memory and we end up sending the entire transaction again. One way
> to avoid this overhead is that the publisher before sending the entire
> transaction checks with subscriber whether it has a prepared
> transaction corresponding to the current commit. I understand that
> this is not a good idea even if it works but I don't have any better
> ideas. What do you think?

I considered it, but I am not sure it is good to add such a mechanism. Your idea
requires an additional wait-loop, which might lead to bugs and unexpected
behavior. And it may degrade performance depending on the network environment.
As for the other solution (the worker sends a list of prepared transactions), it
is also not so good because the list of prepared transactions may be huge.

Based on the above, I think we can reject the case for now.

FYI - we also considered the idea that the walsender waits until all prepared
transactions are resolved before decoding and sending changes, but it did not
work well - the restarted walsender sent only the COMMIT PREPARED record for
transactions which had been prepared before disabling the subscription. This
happened because
1) if the two_phase option of the slot is false, confirmed_flush can be ahead of
   the PREPARE record, and
2) after the altering and restarting, start_decoding_at becomes the same as
   confirmed_flush and records behind it won't be decoded.

> > 3. We could mandate that the altering of two_phase state only be done after
> disabling the subscription, just like how it is handled for failover option.
> >
> 
> makes sense.

OK, this spec was added.

According to the above, I updated the patch with Ajin.
0001 - extends the ALTER SUBSCRIPTION statement. A tab-completion was added.
0002 - mandates that the subscription has been disabled. Since there is no need
       to change AtEOXact_ApplyLauncher(), that change was reverted.
       If there are no objections, this can be included in 0001.
0003 - checks whether there are transactions prepared by the worker. If found,
       rejects the ALTER SUBSCRIPTION command.
0004 - checks whether there are transactions prepared on the publisher. The
       backend connects to the publisher and confirms it. If found, rejects the
       ALTER SUBSCRIPTION command.
0005 - adds a TAP test for it.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



v3-0001-Allow-altering-of-two_phase-option-of-a-SUBSCRIPT.patch
Description:  v3-0001-Allow-altering-of-two_phase-option-of-a-SUBSCRIPT.patch


v3-0002-Mandate-the-subscription-h

RE: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-04-10 Thread Hayato Kuroda (Fujitsu)
Dear Amit,

> One naive idea is that on the publisher we can remember whether the
> prepare has been sent and if so then only send commit_prepared,
> otherwise send the entire transaction. On the subscriber-side, we
> somehow, need to ensure before applying the first change whether the
> corresponding transaction is already prepared and if so then skip the
> changes and just perform the commit prepared. One drawback of this
> approach is that after restart, the prepare flag wouldn't be saved in
> the memory and we end up sending the entire transaction again. One way
> to avoid this overhead is that the publisher before sending the entire
> transaction checks with subscriber whether it has a prepared
> transaction corresponding to the current commit. I understand that
> this is not a good idea even if it works but I don't have any better
> ideas. What do you think?

An alternative idea is that the worker passes a list of prepared transactions as
a new option in START_REPLICATION. This can reduce the number of inter-node
communications, but sometimes the list may be huge.
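
Just to sketch the idea (the prepared_xacts option is hypothetical, not an
existing one):

```
START_REPLICATION SLOT "sub" LOGICAL 0/0
    (proto_version '4', publication_names '"pub"',
     prepared_xacts 'pg_gid_16390_741,pg_gid_16390_742')
```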

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 


RE: Improve eviction algorithm in ReorderBuffer

2024-04-10 Thread Hayato Kuroda (Fujitsu)
Dear Heikki,

I also prototyped the idea, which has almost the same shape.
I attached it just in case, but we may not have to look at it.

A few comments based on the experiment.

```
+   /* txn_heap is ordered by transaction size */
+   buffer->txn_heap = pairingheap_allocate(ReorderBufferTXNSizeCompare, NULL);
```

I think the pairing heap should be in the same MemoryContext as the buffer.
Can we add MemoryContextSwitchTo()?
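
Something like this is what I had in mind (a sketch; it assumes buffer->context
is the ReorderBuffer's own context set up earlier in ReorderBufferAllocate()):

```
/* Allocate the max-heap in the ReorderBuffer's memory context */
oldcontext = MemoryContextSwitchTo(buffer->context);

/* txn_heap is ordered by transaction size */
buffer->txn_heap = pairingheap_allocate(ReorderBufferTXNSizeCompare, NULL);

MemoryContextSwitchTo(oldcontext);
```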

```
+   /* Update the max-heap */
+   if (oldsize != 0)
+       pairingheap_remove(rb->txn_heap, &txn->txn_node);
+   pairingheap_add(rb->txn_heap, &txn->txn_node);
...
+   /* Update the max-heap */
+   pairingheap_remove(rb->txn_heap, &txn->txn_node);
+   if (txn->size != 0)
+       pairingheap_add(rb->txn_heap, &txn->txn_node);
```

Since the number of stored transactions does not affect the insert operation, we
may be able to add the node while creating the ReorderBufferTXN and remove it
while cleaning up. This can reduce branches in
ReorderBufferChangeMemoryUpdate().

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



RE: Is this a problem in GenericXLogFinish()?

2024-04-09 Thread Hayato Kuroda (Fujitsu)
Dear Michael,

> On Fri, Apr 05, 2024 at 06:22:58AM +0000, Hayato Kuroda (Fujitsu) wrote:
> > Thanks for pointing out. Yes, we fixed a behavior by the commit aa5edbe37,
> > but we missed the redo case. I made a fix patch based on the suggestion [1].
> 
> +   bool        mod_buf = false;
> 
> Perhaps you could name that mod_wbuf, to be consistent with the WAL
> insert path.

Right, fixed.

> I'm slightly annoyed by the fact that there is no check on
> is_prim_bucket_same_wrt for buffer 1 in the BLK_NEEDS_REDO case to
> show the symmetry between the insert and replay paths.  Shouldn't
> there be at least an assert for that in the branch when there are no
> tuples (imagine an else to cover xldata->ntups == 0)?  I mean between
> just before updating the hash page's opaque area when
> is_prev_bucket_same_wrt.

Indeed, I added an Assert() in the else part. Is it what you expected?

> I've been thinking about ways to make such cases detectable in an
> automated fashion.  The best choice would be
> verifyBackupPageConsistency(), just after RestoreBlockImage() on the
> copy of the block image stored in WAL before applying the page masking
> which would mask the LSN.  A problem, unfortunately, is that we would
> need to transfer the knowledge of REGBUF_NO_CHANGE as a new BKPIMAGE_*
> flag so we would be able to track if the block rebuilt from the record
> has the *same* LSN as the copy used for the consistency check.  So
> this edge consistency case would come at a cost, I am afraid, and the
> 8 bits of bimg_info are precious :/

I could not decide whether the change has more benefit than cost, so I did
nothing for it.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



v2_add_modbuf_flag.diff
Description: v2_add_modbuf_flag.diff


RE: Is this a problem in GenericXLogFinish()?

2024-04-04 Thread Hayato Kuroda (Fujitsu)
Dear Michael,

> There is still some divergence between the code path of
> _hash_freeovflpage() and the replay in hash_xlog_squeeze_page() when
> squeezing a page: we do not set the LSN of the write buffer if
> (xlrec.ntups <= 0 && xlrec.is_prim_bucket_same_wrt &&
> !xlrec.is_prev_bucket_same_wrt) when generating the squeeze record,
> but at replay we call PageSetLSN() on the write buffer and mark it
> dirty in this case.  Isn't that incorrect?  It seems to me that we
> should be able to make the conditional depending on the state of the
> xl_hash_squeeze_page record, no?

Thanks for pointing this out. Yes, we fixed the behavior in commit aa5edbe37,
but we missed the redo case. I made a fix patch based on the suggestion [1].

The part below contains my analysis and how I reproduced the issue.
I posted it to clarify the issue for others.


=

## Pointed issue

Assume the case where
- xlrec.ntups is 0,
- xlrec.is_prim_bucket_same_wrt is true, and
- xlrec.is_prev_bucket_same_wrt is false.
This means that there are several overflow pages and the tail dead page is being
removed. In this case, the primary page does not have to be modified.

During normal operation, the removal is done in _hash_freeovflpage().
If the above condition is met, mod_wbuf is not set to true, so PageSetLSN() is
skipped.

During recovery, the squeeze and removal are done in hash_xlog_squeeze_page().
In this function, PageSetLSN() is called unconditionally.
He said that in this case the PageSetLSN() should be avoided as well.

## Analysis

IIUC there is the same issue as pointed out by [2].
He said that PageSetLSN() should be called only when the page is really modified.
In the case under discussion, wbuf is not modified (just removing the tail
entry), so there is no need to assign an LSN to it. However, we missed updating
the redo case as well.

## How to reproduce

I could reproduce the case with the steps below.
1. Added the debug log like [3].
2. Constructed a physical replication setup.
3. Ran hash_index.sql.
4. Found the added debug log in the server log.

[1]: 
https://www.postgresql.org/message-id/CAA4eK1%2BayneM-8mSRC0iWpDMnm39EwDoqgiOCSqrrMLcdnUnAA%40mail.gmail.com
[2]: https://www.postgresql.org/message-id/ZbyVVG_7eW3YD5-A%40paquier.xyz
[3]:
```
--- a/src/backend/access/hash/hash_xlog.c
+++ b/src/backend/access/hash/hash_xlog.c
@@ -713,6 +713,11 @@ hash_xlog_squeeze_page(XLogReaderState *record)
 		writeopaque->hasho_nextblkno = xldata->nextblkno;
 	}
 
+	if (xldata->ntups == 0 &&
+		xldata->is_prim_bucket_same_wrt &&
+		!xldata->is_prev_bucket_same_wrt)
+		elog(LOG, "XXX no need to set PageSetLSN");
+
```

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



add_modbuf_flag.diff
Description: add_modbuf_flag.diff


RE: Improve eviction algorithm in ReorderBuffer

2024-03-29 Thread Hayato Kuroda (Fujitsu)
Dear Sawada-san,

> Agreed.
> 
> I think the patch is in good shape. I'll push the patch with the
> suggestion next week, barring any objections.

Thanks for working on this. Agreed it is committable.
A few minor comments:

```
+ * Either txn or change must be non-NULL at least. We update the memory
+ * counter of txn if it's non-NULL, otherwise change->txn.
```

IIUC no one checks the restriction. Should we add an Assert() for it, e.g.:
Assert(txn || change)?

```
+/* make sure enough space for a new node */
...
+/* make sure enough space for a new node */
```

Should they start with an upper-case letter?

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



RE: Synchronizing slots from primary to standby

2024-03-28 Thread Hayato Kuroda (Fujitsu)
Dear Hou,

Thanks for updating the patch! Here is a comment for it.

```
+/*
+ * By advancing the restart_lsn, confirmed_lsn, and xmin using
+ * fast-forward logical decoding, we can verify whether a consistent
+ * snapshot can be built. This process also involves saving necessary
+ * snapshots to disk during decoding, ensuring that logical decoding
+ * efficiently reaches a consistent point at the restart_lsn without
+ * the potential loss of data during snapshot creation.
+ */
+pg_logical_replication_slot_advance(remote_slot->confirmed_lsn,
+found_consistent_point);
+ReplicationSlotsComputeRequiredLSN();
+updated_lsn = true;
```

You added these like pg_replication_slot_advance(), but that function also calls
ReplicationSlotsComputeRequiredXmin(false) at that time. According to the related
commit b48df81 and discussions [1], I know it is needed only for physical slots,
but it would be more consistent to call ReplicationSlotsComputeRequiredXmin() as
well, per [2]:

```
This may be a waste if no advancing is done, but it could also be an
advantage to enforce a recalculation of the thresholds for each
function call.  And that's more consistent with the slot copy, drop
and creation.
```

What do you think?
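
Concretely, I am suggesting something like this (a sketch):

```
pg_logical_replication_slot_advance(remote_slot->confirmed_lsn,
                                    found_consistent_point);
ReplicationSlotsComputeRequiredLSN();
ReplicationSlotsComputeRequiredXmin(false);    /* suggested addition */
updated_lsn = true;
```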

[1]: 
https://www.postgresql.org/message-id/20200609171904.kpltxxvjzislidks%40alap3.anarazel.de
[2]: https://www.postgresql.org/message-id/20200616072727.GA2361%40paquier.xyz

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



RE: pg_upgrade and logical replication

2024-03-27 Thread Hayato Kuroda (Fujitsu)
Dear Vignesh,

> 
> Recently there was a failure in 004_subscription tap test at [1].
> In this failure, the tab_upgraded1 table was expected to have 51
> records but has only 50 records. Before the upgrade both publisher and
> subscriber have 50 records.

Good catch!

> After the upgrade we have inserted one record in the publisher, now
> tab_upgraded1 will have 51 records in the publisher. Then we start the
> subscriber after changing max_logical_replication_workers so that
> apply workers get started and apply the changes received. After
> starting we enable regress_sub5, wait for sync of regress_sub5
> subscription and check for tab_upgraded1 and tab_upgraded2 table data.
> In a few random cases the one record that was inserted into
> tab_upgraded1 table will not get replicated as we have not waited for
> regress_sub4 subscription to apply the changes from the publisher.
> The attached patch has changes to wait for regress_sub4 subscription
> to apply the changes from the publisher before verifying the data.
> 
> [1] -
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mamba&dt=2024-03-
> 26%2004%3A23%3A13

Yeah, I think it is an oversight in f17529. Previously, subscriptions receiving
changes were confirmed to be caught up; I missed adding that line while
restructuring the script. +1 for your fix.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



RE: speed up a logical replica setup

2024-03-25 Thread Hayato Kuroda (Fujitsu)
Dear Amit, Euler,

> 
> This only drops the publications created by this tool, not the
> pre-existing ones that we discussed in the link provided.

Another concern around here is the case where the primary subscribes to changes
from others. After the conversion, the new subscriber also tries to connect to
the other publisher - this may lead to conflicts. This happens because both the
launcher and workers start after recovery finishes. So, based on Ashutosh's
point, should we remove such replication objects?

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



RE: speed up a logical replica setup

2024-03-25 Thread Hayato Kuroda (Fujitsu)
Dear Bharath, Peter,

> Looks like BF animals aren't happy, please check -
> > https://buildfarm.postgresql.org/cgi-bin/show_failures.pl.
> 
> Looks like sanitizer failures.  There were a few messages about that
> recently, but those were all just about freeing memory after use, which
> we don't necessarily require for client programs.  So maybe something else.

It seems that there are several kinds of failures, [1] and [2].

## Analysis for failure 1

The failure is caused by a time lag between when the walreceiver finishes and
when pg_is_in_recovery() starts returning false.

According to the output [1], it seems that the tool failed at
wait_for_end_recovery() with the message "standby server disconnected from the
primary". Also, the lines "redo done at..." and "terminating walreceiver process
due to administrator command" mean that the walreceiver was requested to shut
down by XLogShutdownWalRcv().

According to the source, the walreceiver is shut down in
StartupXLOG()->FinishWalRecovery()->XLogShutdownWalRcv(). Also,
SharedRecoveryState is changed to RECOVERY_STATE_DONE (which makes
pg_is_in_recovery() return false) in the latter part of StartupXLOG().

So, if there is a delay between FinishWalRecovery() and changing the state, the
check in wait_for_end_recovery() can fail during that time. Since we allow
missing the walreceiver 10 times and it is checked once per second, the failure
occurs if the time lag is longer than 10 seconds.

I do not have a good way to fix it. One approach is to make NUM_CONN_ATTEMPTS
larger, but that is not a fundamental solution.
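
For reference, the loop in question behaves roughly like this (a simplified
sketch with hypothetical helper names, not the actual code):

```
for (;;)
{
	if (!server_is_in_recovery(conn))
		break;					/* recovery finished: success */

	/*
	 * The walreceiver was already shut down by XLogShutdownWalRcv(), so
	 * this check fails while SharedRecoveryState is still not DONE.
	 */
	if (!standby_is_connected_to_primary(conn))
		missed++;

	if (missed >= NUM_CONN_ATTEMPTS)	/* 10 attempts */
		pg_fatal("standby server disconnected from the primary");

	pg_usleep(WAIT_INTERVAL * USEC_PER_SEC);	/* checked once per second */
}
```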

## Analysis for failure 2

According to [2], the physical replication slot specified as primary_slot_name
was not used by the walsender process. At that time, the walsender did not exist
yet.

```
...
pg_createsubscriber: publisher: current wal senders: 0
pg_createsubscriber: command is: SELECT 1 FROM pg_catalog.pg_replication_slots 
WHERE active AND slot_name = 'physical_slot'
pg_createsubscriber: error: could not obtain replication slot information: got 
0 rows, expected 1 row
...
```

Currently, the standby must be stopped before the command, and the current code
does not block the flow to ensure the replication has started. So there is a
possibility that the check runs before the walsender is launched.

One possible approach is to wait until the replication starts. An alternative is
to ease the condition.

What do you think?

[1]: 
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=serinus&dt=2024-03-25%2013%3A03%3A07
[2]: 
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae&dt=2024-03-25%2013%3A53%3A58

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 


RE: speed up a logical replica setup

2024-03-25 Thread Hayato Kuroda (Fujitsu)
> IIUC, added options were inspired by Tomas. And he said the limitation
> (pub/sub/slot
> name cannot be specified) was acceptable for the first version. I agree with 
> him.
> (To be honest, I feel that options should be fewer for the first release)

Just to confirm - I don't think it is completely unneeded. If we can agree on a
specification at some point, it's OK for me to add them. If you ask me, I still
prefer Tomas's approach.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



RE: speed up a logical replica setup

2024-03-24 Thread Hayato Kuroda (Fujitsu)
Dear Amit, Euler,

I also want to share my opinion, just in case.

> On Thu, Mar 21, 2024 at 8:00 PM Euler Taveira  wrote:
> >
> > On Thu, Mar 21, 2024, at 10:33 AM, vignesh C wrote:
> >
> > If we commit this we might not be able to change the way the option
> > behaves once the customers starts using it. How about removing these
> > options in the first version and adding it in the next version after
> > more discussion.
> >
> >
> > We don't need to redesign this one if we want to add a format string in a 
> > next
> > version. A long time ago, pg_dump started to accept pattern for tables 
> > without
> > breaking or deprecating the -t option. If you have 100 databases and you 
> > don't
> > want to specify the options or use a script to generate it for you, you also
> > have the option to let pg_createsubscriber generate the object names for 
> > you.
> > Per my experience, it will be a rare case.
> >
> 
> But, why go with some UI in the first place which we don't think is a
> good one, or at least don't have a broader agreement that it is a good
> one? The problem with self-generated names for users could be that
> they won't be able to make much out of it. If one has to always use
> those internally then probably that would be acceptable. I would
> prefer what Tomas proposed a few emails ago as compared to this one as
> that has fewer options to be provided by users but still, they can
> later identify objects. But surely, we should discuss if you or others
> have better alternatives.

IIUC, the added options were inspired by Tomas. And he said the limitation
(pub/sub/slot names cannot be specified) was acceptable for the first version. I
agree with him.
(To be honest, I feel that there should be fewer options for the first release.)

> The user choosing to convert a physical standby to a subscriber would
> in some cases be interested in converting it for all the databases
> (say for the case of upgrade [1]) and giving options for each database
> would be cumbersome for her.

+1 for the primary use case.

> > Currently dry-run will do the check and might fail on identifying a
> > few failures like after checking subscriber configurations. Then the
> > user will have to correct the configuration and re-run then fix the
> > next set of failures. Whereas the suggest-config will display all the
> > optimal configuration for both the primary and the standby in a single
> > shot. This is not a must in the first version, it can be done as a
> > subsequent enhancement.
> >
> >
> > Do you meant publisher, right? Per order, check_subscriber is done before
> > check_publisher and it checks all settings on the subscriber before 
> > exiting. In
> > v30, I changed the way it provides the required settings. In a previous 
> > version,
> > it fails when it found a wrong setting; the current version, check all 
> > settings
> > from that server before providing a suitable error.
> >
> > pg_createsubscriber: checking settings on publisher
> > pg_createsubscriber: primary has replication slot "physical_slot"
> > pg_createsubscriber: error: publisher requires wal_level >= logical
> > pg_createsubscriber: error: publisher requires 2 replication slots, but 
> > only 0
> remain
> > pg_createsubscriber: hint: Consider increasing max_replication_slots to at 
> > least
> 3.
> > pg_createsubscriber: error: publisher requires 2 wal sender processes, but 
> > only
> 0 remain
> > pg_createsubscriber: hint: Consider increasing max_wal_senders to at least 
> > 3.
> >
> > If you have such an error, you will fix them all and rerun using dry run 
> > mode
> > again to verify everything is ok. I don't have a strong preference about 
> > it. It
> > can be changed easily (unifying the check functions or providing a return 
> > for
> > each of the check functions).
> >
> 
> We can unify the checks but not sure if it is worth it. I am fine
> either way. It would have been better if we provided a way for a user
> to know the tool's requirement rather than letting him know via errors
> but I think it should be okay to extend it later as well.

Both ways are OK, but I slightly prefer to unify the checks. The number of
working modes in the same executable should be reduced as much as possible.

Also, I feel the current specification is acceptable. pg_upgrade checks one
cluster at a time and exits early in bad cases. If both the old and new clusters
have issues, the first run only reports the old one's issues. After the DBA
fixes them and runs again, issues on the new cluster are reported.

[1]: 
https://www.postgresql.org/message-id/8d52c226-7e34-44f7-a919-759bf8d81541%40enterprisedb.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



RE: Have pg_basebackup write "dbname" in "primary_conninfo"?

2024-03-19 Thread Hayato Kuroda (Fujitsu)
Dear Sawada-san,

Thanks for the comments!

> This behavior makes sense to me. But do we want to handle the case of
> using environment variables too? 

Yeah, v5 does not consider libpq parameters specified by environment variables.
Such a variable should be used when the dbname is not explicitly written in the
connection string.
Such a path was added in the v6 patch. If the dbname is not determined after
parsing the connection string, we call PQconndefaults() to get settings from
environment variables and service files [1], then search for the dbname again.
Below is an example.

```
PGPORT=5431 PGUSER=kuroda PGDATABASE=postgres pg_basebackup -D data_N2 -R -v
->
primary_conninfo = 'user=kuroda ... port=5431 ... dbname=postgres ... '
```
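
The fallback is essentially the following (a simplified sketch; the actual patch
differs in details):

```
/*
 * dbname was not found in the connection string; consult the defaults,
 * which include environment variables and service files.
 */
PQconninfoOption *defs = PQconndefaults();

for (PQconninfoOption *opt = defs; opt && opt->keyword; opt++)
{
	if (strcmp(opt->keyword, "dbname") == 0 && opt->val && opt->val[0])
		dbname = pg_strdup(opt->val);
}
PQconninfoFree(defs);
```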

> IIUC,
>
> pg_basebackup -D tmp -d "user=masahiko dbname=test_db"
>
> is equivalent to:
>
> PGDATABASE="user=masahiko dbname=test_db" pg_basebackup -D tmp

That case won't work. I think you assumed that expand_dbname, as in
PQconnectdbParams() [2], can be used for environment variables, but that is not
correct - the value won't be parsed as a connection string again.

In the libpq layer, connection parameters are parsed in 
PQconnectStartParams()->conninfo_array_parse().
When expand_dbname is specified, the "dbname" entry is checked first and its
value is parsed. This is done at fe-connect.c:5846.

The environment variables are checked and parsed in conninfo_add_defaults(), 
which
is called from conninfo_array_parse(). However, that happens at
fe-connect.c:5956 - the expand_dbname handling has already finished by that
time. This means there is no chance that PGDATABASE is parsed in the expanded
style.

For example, if the pg_basebackup runs like below:

PGDATABASE="user=kuroda dbname=postgres" pg_basebackup -D data_N2 -R -v

The primary_conninfo written in the file will be:

primary_conninfo = 'user=hayato ... dbname=''user=kuroda dbname=postgres'''

[1]: https://www.postgresql.org/docs/devel/libpq-pgservice.html
[2]: 
https://www.postgresql.org/docs/devel/libpq-connect.html#LIBPQ-PQCONNECTDBPARAMS

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



pg_basebackup-write-dbname.v0006.patch
Description: pg_basebackup-write-dbname.v0006.patch


RE: speed up a logical replica setup

2024-03-18 Thread Hayato Kuroda (Fujitsu)
Dear Peter,

Thanks for the comments. I would like to reply to some of them.

> > 17.
> >
> > "specifies promote"
> >
> > We can do double-quote for the word promote.
> 
> The v30 patch has promote, which I think is adequate.

Oops. I was actually looking at the v29 patch during my first review. Sorry for the noise.

> 
> > 20.
> >
> > New options must be documented as well. This helps not only users but
> > also reviewers. (Sometimes we cannot tell whether the implementation is
> > intentional or not.)
> 
> Which ones are missing?

In v29, the newly added options (publication/subscription/replication-slot)
were not documented. Since they are documented now, please ignore.

> > 21.
> >
> > Also, not sure the specification is good. I preferred to specify them by 
> > format
> > string. Because it can reduce the number of arguments and I cannot find use
> cases
> > which user want to control the name of objects.
> >
> > However, your approach has a benefit which users can easily identify the
> generated
> > objects by pg_createsubscriber. How do other think?
> 
> I think listing them explicitly is better for the first version.  It's
> simpler to implement and more flexible.

OK.

> > 22.
> >
> > ```
> > #define BASE_OUTPUT_DIR	"pg_createsubscriber_output.d"
> > ```
> >
> > No one refers the define.
> 
> This is gone in v30.

I wrote that for the same reason as above. Please ignore...

> 
> > 31.
> >
> > ```
> > /* Create replication slot on publisher */
> > if (lsn)
> > pg_free(lsn);
> > ```
> >
> > I think allocating/freeing memory is not so efficient.
> > Can we add a flag to create_logical_replication_slot() for controlling the
> > return value (NULL or a duplicated string)? We can use the condition
> > (i == num_dbs-1) as the flag.
> 
> Nothing is even using the return value of
> create_logical_replication_slot().  I think this can be removed altogether.

> > 37.
> >
> > ```
> > /* Register a function to clean up objects in case of failure */
> > atexit(cleanup_objects_atexit);
> > ```
> >
> > Sorry if we have already discussed this. I think the registration can be
> > moved to just before the standby is started. Before that, the callback is
> > a no-op.
> 
> But it can also stay where it is.  What is the advantage of moving it later?

I thought we could reduce the risk of bugs. Previously, some bugs were reported
because the registration was done too early. However, this is not a strong opinion.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



RE: PostgreSQL 17 Release Management Team & Feature Freeze

2024-03-18 Thread Hayato Kuroda (Fujitsu)
Dear Michael,

> If you think that this is OK, and as far as I can see this looks OK on
> the thread, then this open item should be moved under "resolved before
> 17beta1", mentioning the commit involved in the fix.  Please see [1]
> for examples.

OK, I understand that I should wait for your check. Thanks.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 





RE: speed up a logical replica setup

2024-03-17 Thread Hayato Kuroda (Fujitsu)
here another rule?

23.

```
static int  num_dbs = 0;
static int  num_pubs = 0;
static int  num_subs = 0;
static int  num_replslots = 0;
```

I think the names are a bit confusing. The number of generated
publications/subscriptions/replication slots is always the same as the number
of databases; these variables just indicate how many were explicitly
specified.

My idea is num_custom_pubs or something similar. Thoughts?

24.

```
/* standby / subscriber data directory */
static char *subscriber_dir = NULL;
```

It is a bit strange that only subscriber_dir is a global variable. Every caller
requires the CreateSubscriberOptions as an argument, except
cleanup_objects_atexit() and main(). So, how about making
`CreateSubscriberOptions opt` a global as well?

25.

```
 * Replication slots, publications and subscriptions are created. Depending on
 * the step it failed, it should remove the already created objects if it is
 * possible (sometimes it won't work due to a connection issue).
```

I think it should be stated here that subscriptions won't be removed, along
with the reason.

26.

```

/*
 * If the server is promoted, there is no way to use the current setup
 * again. Warn the user that a new replication setup should be done before
 * trying again.
 */
```

Per comment 25, we can add a reference like "See comments atop the function"

27.

usage() was not updated based on recent changes.

28.

```
if (strcmp(conn_opt->keyword, "dbname") == 0 && conn_opt->val != NULL)
{
	if (dbname)
		*dbname = pg_strdup(conn_opt->val);
	continue;
}
```

There is a memory leak if multiple dbname entries are specified in the conninfo.
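
A minimal sketch of one possible fix, assuming the caller initializes *dbname
to NULL (pg_free() is a no-op for NULL pointers):

```
if (strcmp(conn_opt->keyword, "dbname") == 0 && conn_opt->val != NULL)
{
	if (dbname)
	{
		/* discard the copy made for a previous "dbname" entry, if any */
		pg_free(*dbname);
		*dbname = pg_strdup(conn_opt->val);
	}
	continue;
}
```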

29.

```
pg_prng_seed(&prng_state, (uint64) (getpid() ^ time(NULL)));
```

No need to initialize the seed every time. Can you reuse pg_prng_state?
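
A minimal sketch of what I have in mind (the prng_seeded flag is hypothetical):

```
static pg_prng_state prng_state;
static bool prng_seeded = false;
...
/* seed the generator only on the first call */
if (!prng_seeded)
{
	pg_prng_seed(&prng_state, (uint64) (getpid() ^ time(NULL)));
	prng_seeded = true;
}
```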

30.

```
if (num_replslots == 0)
	dbinfo[i].replslotname = pg_strdup(genname);
```

I think the straightforward way is to use the name of the subscription if no
slot name is specified. This follows the default rule of CREATE SUBSCRIPTION.
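
A minimal sketch of the idea, assuming dbinfo has a subname field holding the
subscription name:

```
if (num_replslots == 0)
	dbinfo[i].replslotname = pg_strdup(dbinfo[i].subname);
```

This matches CREATE SUBSCRIPTION, whose slot_name defaults to the subscription
name.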

31.

```
/* Create replication slot on publisher */
if (lsn)
pg_free(lsn);
```

I think allocating/freeing memory like this is not very efficient.
Can we add a flag to create_logical_replication_slot() for controlling the
return value (NULL or a duplicated string)? We can use the condition
(i == num_dbs-1) as the flag.

32.

```
/*
 * Close the connection. If exit_on_error is true, it has an undesired
 * condition and it should exit immediately.
 */
static void
disconnect_database(PGconn *conn, bool exit_on_error)
```

In the case of disconnect_database(), the second argument should have a
different name: if it is true, the process exits unconditionally.
Also, the comments atop the function must be fixed.


33.

```
wal_level = strdup(PQgetvalue(res, 0, 0));
```

pg_strdup should be used here.

34.

```
{"config-file", required_argument, NULL, 1},
{"publication", required_argument, NULL, 2},
{"replication-slot", required_argument, NULL, 3},
{"subscription", required_argument, NULL, 4},
```

The ordering looks strange to me. Following pg_upgrade and pg_basebackup,
options which do not have a short form are listed last.

35.

```
opt.sub_port = palloc(16);
```

Per other lines, pg_malloc() should be used.

36.

```
pg_free(opt.sub_port);
```

You said that leaks are not a concern here. If so, why does only 'p' have
pg_free()?

37.

```
/* Register a function to clean up objects in case of failure */
atexit(cleanup_objects_atexit);
```

Sorry if we have already discussed this. I think the registration can be moved
to just before the standby is started. Before that, the callback is a no-op.

38.

```
/* Subscriber PID file */
snprintf(pidfile, MAXPGPATH, "%s/postmaster.pid", subscriber_dir);

/*
 * If the standby server is running, stop it. Some parameters (that can
 * only be set at server start) are informed by command-line options.
 */
if (stat(pidfile, &statbuf) == 0)
```

Hmm. pidfile is used only here, but it is declared in main(). Can this be
separated into another function like is_standby_started()?
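
A minimal sketch based on the quoted code:

```
static bool
is_standby_started(const char *datadir)
{
	char		pidfile[MAXPGPATH];
	struct stat statbuf;

	/* the postmaster creates postmaster.pid while it is running */
	snprintf(pidfile, MAXPGPATH, "%s/postmaster.pid", datadir);
	return (stat(pidfile, &statbuf) == 0);
}
```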

39.

Or, we may be able to introduce "restart_standby_if_needed" or something.

40.

```
 * XXX this code was extracted from BootStrapXLOG().
```

So, can we extract the common part somewhere? Since the system identifier is
related to the controldata file, I think it could be located in
controldata_util.c.

41.

You said the following in [1], but I could not find the related fix. Can you
clarify?

> That's a good point. We should state in the documentation that GUCs specified 
> in
> the command-line options are ignored during the execution.

[1]: 
https://www.postgresql.org/message-id/40595e73-c7e1-463a-b8be-49792e870007%40app.fastmail.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/global/ 



RE: PostgreSQL 17 Release Management Team & Feature Freeze

2024-03-17 Thread Hayato Kuroda (Fujitsu)
Dear Michael,

> We are pleased to announce the Release Management Team (RMT) (cc'd)
> for the PostgreSQL 17 release:
> - Robert Haas
> - Heikki Linnakangas
> - Michael Paquier

Thanks for managing the PostgreSQL release and keeping it headed in the right direction!

> You can track open items for the PostgreSQL 17 release here:
> https://wiki.postgresql.org/wiki/PostgreSQL_17_Open_Items

I think the entry can be closed:

```
pg_upgrade with --link mode failing upgrade of publishers
Commit: 29d0a77fa660
Owner: Amit Kapila
```

The reported failure was only related to the test script, not the feature.
The issue has already been analyzed and the fix was pushed as f17529b710.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 





RE: speed up a logical replica setup

2024-03-10 Thread Hayato Kuroda (Fujitsu)
Dear Vignesh,

Thanks for updating the patch, but cfbot is still unhappy [1].
Note that the two containers (autoconf and meson) failed at different places,
so I do not think it is a random failure. It seems that there may be a bug
related to the 32-bit build.

We should investigate and fix it as soon as possible.

[1]: http://cfbot.cputube.org/highlights/all.html#4637

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



RE: speed up a logical replica setup

2024-03-08 Thread Hayato Kuroda (Fujitsu)
Dear Tomas, Euler,

Thanks for starting to read the thread! Since I'm not the original author,
I'll reply to only some of the points.

> I decided to take a quick look on this patch today, to see how it works
> and do some simple tests. I've only started to get familiar with it, so
> I have only some comments / questions regarding usage, not on the code.
> It's quite possible I didn't understand some finer points, or maybe it
> was already discussed earlier in this very long thread, so please feel
> free to push back or point me to the past discussion.
> 
> Also, some of this is rather opinionated, but considering I didn't see
> this patch before, my opinions may easily be wrong ...

I felt your comments were quite valuable.

> 1) SGML docs
> 
> It seems the SGML docs are more about explaining how this works on the
> inside, rather than how to use the tool. Maybe that's intentional, but
> as someone who didn't work with pg_createsubscriber before I found it
> confusing and not very helpful.
>
> For example, the first half of the page is prerequisites+warning, and
> sure those are useful details, but prerequisites are checked by the
> tool (so I can't really miss this) and warnings go into a lot of details
> about different places where things may go wrong. Sure, worth knowing
> and including in the docs, but maybe not right at the beginning, before
> I learn how to even run the tool?

Hmm, right. I considered the improvements below. Tomas and Euler, what do you
think?

* Add more descriptions to the "Description" section.
* Move the prerequisites+warning to the "Notes" section.
* Add a "Usage" section which describes the setup starting from a single node.

> I'm not sure FOR ALL TABLES is a good idea. Or said differently, I'm
> sure it won't work for a number of use cases. I know large databases
> it's common to create "work tables" (not necessarily temporary) as part
> of a batch job, but there's no need to replicate those tables.

Indeed, the documentation does not state that all tables in the database
will be included in the publication.

> I do understand that FOR ALL TABLES is the simplest approach, and for v1
> it may be an acceptable limitation, but maybe it'd be good to also
> support restricting which tables should be replicated (e.g. blacklist or
> whitelist based on table/schema name?).

This may not be directly related, but we considered accepting such options as
a next step [1].

> Note: I now realize this might fall under the warning about DDL, which
> says this:
> 
> Executing DDL commands on the source server while running
> pg_createsubscriber is not recommended. If the target server has
> already been converted to logical replica, the DDL commands must
> not be replicated so an error would occur.

Yeah, they would not be replicated, but they do not lead to an ERROR.
So should we say something like "Creating tables on the source server..."?

> 5) slot / publication / subscription name
> 
> I find it somewhat annoying it's not possible to specify names for
> objects created by the tool - replication slots, publication and
> subscriptions. If this is meant to be a replica running for a while,
> after a while I'll have no idea what pg_createsubscriber_569853 or
> pg_createsubscriber_459548_2348239 was meant for.
> 
> This is particularly annoying because renaming these objects later is
> either not supported at all (e.g. for replication slots), or may be
> quite difficult (e.g. publications).
> 
> I do realize there are challenges with custom names (say, if there are
> multiple databases to replicate), but can't we support some simple
> formatting with basic placeholders? So we could specify
> 
> --slot-name "myslot_%d_%p"
> 
> or something like that?

Not sure we can do it in the first version, but it looks nice. One concern is
that I cannot find client applications which accept escape strings like
log_line_prefix. (It may just be because we do not have a use case.) Do you
know of examples?

> BTW what will happen if we convert multiple standbys? Can't they all get
> the same slot name (they all have the same database OID, and I'm not
> sure how much entropy the PID has)?

I tested it, and the second try did not work. The primary reason was the
publication name - pg_createsubscriber_%u (oid).
FYI - previously we could reuse the same publications, but based on my comment
[2] that feature was removed. It might have been too optimistic.

[1]: 
https://www.postgresql.org/message-id/TY3PR01MB9889CCBD4D9DAF8BD2F18541F56F2%40TY3PR01MB9889.jpnprd01.prod.outlook.com
[2]: 
https://www.postgresql.org/message-id/TYCPR01MB12077756323B79042F29DDAEDF54C2%40TYCPR01MB12077.jpnprd01.prod.outlook.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



RE: speed up a logical replica setup

2024-03-06 Thread Hayato Kuroda (Fujitsu)
nvironment
 * variable sets it. If not, obtain the operating system name of the user
 * running it.
 */
```
There is an unnecessary blank line.

08. main
```
char   *errstr = NULL;
```
 
This declaration can be moved into the else-branch.

09. main

Also, in the first place, do we have to get the username if it is not specified?
I feel libpq can handle that case if we just skip passing the info.

10. main
```
appendPQExpBuffer(sub_conninfo_str,
				  "host=%s port=%u user=%s fallback_application_name=%s",
				  opt.socket_dir, opt.sub_port, opt.sub_username, progname);
sub_base_conninfo = get_base_conninfo(sub_conninfo_str->data, NULL);
```

Is it really necessary to call get_base_conninfo()? I think there is no need
to define sub_base_conninfo.

11. main

```
/*
 * In dry run mode, the server is restarted with the provided command-line
 * options so validation can be applied in the target server. In order to
 * preserve the initial state of the server (running), start it without
 * the command-line options.
 */
if (dry_run)
	start_standby_server(&opt, pg_ctl_path, NULL, false);
```

I think the initial state of the server may also be stopped; currently both
states are allowed. And I think it is not good to leave the logfile
unspecified.

12. others

As Peter E pointed out [1], the main function is still huge: it has more than
400 lines. I think all functions should have fewer than 100 lines to keep them
readable.

I considered a separation like the one below. Note that this may require
changing the ordering. What do you think?

* add parse_command_options() which accepts user options and verifies them
* add verification_phase() or something which checks the system identifier and
  calls check_XXX
* add catchup_phase() or something which creates a temporary slot, writes
  recovery parameters, and waits until the end of recovery
* add cleanup_phase() or something which removes primary_slot and modifies the
  system identifier
* stop/start server can be combined into one wrapper.

The attached txt file is a proof of concept.

13. others

PQresultStatus(res) is called 17 times in this source file, which seems
redundant. I think we can introduce a function like executeQueryOrDie() and
gather the checks in one place.
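
A minimal sketch of such a helper (the name and error handling are
illustrative; it reuses disconnect_database() from the patch):

```
static PGresult *
executeQueryOrDie(PGconn *conn, const char *query)
{
	PGresult   *res = PQexec(conn, query);
	ExecStatusType status = PQresultStatus(res);

	if (status != PGRES_TUPLES_OK && status != PGRES_COMMAND_OK)
	{
		pg_log_error("query failed: %s", PQresultErrorMessage(res));
		pg_log_error_detail("Query was: %s", query);
		PQclear(res);
		disconnect_database(conn, true);	/* does not return */
	}
	return res;
}
```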

14. others

I found that pg_createsubscriber does not refer to functions declared in
other files. Is there a possibility of using them, e.g., streamutil.h?

15. others 

While reading the old discussions [2], I saw that Amit suggested keeping the
comment and avoiding the creation of a temporary slot. You said "Got it", but
the temporary slot still exists. Is there a reason? Can you clarify your
opinion?

16. others

While reading [2] and [3], I was confused by the decision. You and Amit
discussed the combination of pg_createsubscriber and slot sync, and how to
handle slots on the physical standby. You seemed to agree to remove such
slots, and Amit also suggested raising an ERROR. However, you said in [8] that
such handling is not mandatory, so a WARNING should be raised in dry_run mode.
I am quite confused. Am I missing something?

17. others

Per the discussion around [4], we might have to consider the case where
options like data_directory and config_file were initially specified for the
standby server. Another easy approach is to allow users to specify options
like -o in pg_upgrade [5], which is similar to your idea. Thoughts?


18. others

How do you handle the reported failure [6]?

19. main
```
char   *pub_base_conninfo = NULL;
char   *sub_base_conninfo = NULL;
char   *dbname_conninfo = NULL;
```

No need to initialize pub_base_conninfo and sub_base_conninfo.
Also, these variables are never free'd.

20. others

IIUC, slot creation will not finish if there are prepared transactions.
Should we detect this in the verification phase and raise an ERROR?
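
A hypothetical check for the verification phase (pg_prepared_xacts is the
system view listing prepared transactions):

```
res = PQexec(conn, "SELECT count(*) FROM pg_prepared_xacts");
if (PQresultStatus(res) == PGRES_TUPLES_OK &&
	strcmp(PQgetvalue(res, 0, 0), "0") != 0)
	pg_log_error("slot creation cannot finish while prepared transactions exist");
PQclear(res);
```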

21. others

As I said in [7], the catch-up will not finish if a long
recovery_min_apply_delay is used. Should we override it during the catch-up?
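
A hypothetical way to do that while writing the recovery parameters
(recoveryconfcontents is illustrative, following pg_basebackup's
GenerateRecoveryConfig()):

```
/* neutralize any configured apply delay during the catch-up */
appendPQExpBufferStr(recoveryconfcontents,
					 "recovery_min_apply_delay = 0\n");
```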

22. pg_createsubscriber.sgml
```

 Check
 Write recovery parameters into the target data...
```

Not sure, but "Check" seems unnecessary.

[1]: 
https://www.postgresql.org/message-id/b9aa614c-84ba-a869-582f-8d5e3ab57424%40enterprisedb.com
[2]: 
https://www.postgresql.org/message-id/9fd3018d-0e5f-4507-aee6-efabfb5a4440%40app.fastmail.com
[3]: 
https://www.postgresql.org/message-id/CAA4eK1L%2BE-bdKaOMSw-yWizcuprKMyeejyOwWjq_57%3DUqh-f%2Bg%40mail.gmail.com
[4]: 
https://www.postgresql.org/message-id/TYCPR01MB12077B63D81B49E9DFD323661F55A2%40TYCPR01MB12077.jpnprd01.prod.outlook.com
[5]: 
https://www.postgresql.org/docs/devel/pgupgrade.html#:~:text=options%20to%20be%20passed%20directly%20to%20the%20old%20postgres%20command%3B%20multiple%20option%20invocations%20are%20appended
[6]: 
https://www.postgresql.org/message-id/CAHv8Rj%2B5mzK9Jt%2B7ECogJzfm5czvDCCd5jO1_rCx0bTEYpBE5g%40mail.gmail.com
[7]: 
https:
