Marton Greber has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23451 )

Change subject: KUDU-3662 [6/n] Add createTable flag and table initialization 
logic
......................................................................


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/23451/1/java/kudu-replication/src/main/java/org/apache/kudu/replication/ReplicationTableInitializer.java
File 
java/kudu-replication/src/main/java/org/apache/kudu/replication/ReplicationTableInitializer.java:

http://gerrit.cloudera.org:8080/#/c/23451/1/java/kudu-replication/src/main/java/org/apache/kudu/replication/ReplicationTableInitializer.java@55
PS1, Line 55: sourceClient = new KuduClient.KuduClientBuilder(
            :             String. join(",", 
config.getSourceMasterAddresses())).build();
            :     sinkClient = new KuduClient.KuduClientBuilder(
            :             String.join(",", 
config.getSinkMasterAddresses())).build();
            :     sourceTableName = config.getTableName();
            :     sinkTableName = config.getSinkTableName();
            :     restoreOwner = config.getRestoreOwner();
> Why not simply store these as a private final ReplicationJobConfig config c
for the clients we need to have them instantiated in the initializer class. 
when we are done with table creation we can close them, there is no need to 
keep them for the full scope of the env provider.

for sourceTableName, sinkTableName, restoreOwner you are right we could just 
have the ReplicationJobConfig as a member variable and get these 3 variables at 
call sites.

Not sure if this answers your question.


http://gerrit.cloudera.org:8080/#/c/23451/1/java/kudu-replication/src/main/java/org/apache/kudu/replication/ReplicationTableInitializer.java@78
PS1, Line 78: <p>This method is a Java port of the Scala method with the same 
name from
            :    * the kudu-backup module in 
KuduRestore.createTableRangePartitionByRangePartition
> would it make sense to move this to a iseparate location so that both KuduR
Yes this was exactly my thought process. There is just one fundamental 
difference: when we do backup the table metadata is saved by backup 
(TableMetadata.scala), then during restore this file is used.

In replication we use Kudu clients on the fly to get the required table 
metadata, to then re-create the schema/partition schema.

The logical pieces are common but the refactor is not trivial. That is why I 
opted to duplicate the logic but with client usage here. But ideally the two 
should converge, created a ticket for it: KUDU-3702.



--
To view, visit http://gerrit.cloudera.org:8080/23451
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1002b4ba272c1acaab351e3ff3f341ca327070d2
Gerrit-Change-Number: 23451
Gerrit-PatchSet: 1
Gerrit-Owner: Marton Greber <[email protected]>
Gerrit-Reviewer: Abhishek Chennaka <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Attila Bukor <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Marton Greber <[email protected]>
Gerrit-Reviewer: Zoltan Chovan <[email protected]>
Gerrit-Comment-Date: Wed, 24 Sep 2025 09:21:56 +0000
Gerrit-HasComments: Yes

Reply via email to