VitoMakarevich commented on code in PR #10460:
URL: https://github.com/apache/hudi/pull/10460#discussion_r1526974645
##########
hudi-aws/src/main/java/org/apache/hudi/aws/sync/AWSGlueCatalogSyncClient.java:
##########
@@ -141,105 +156,196 @@ public AWSGlueCatalogSyncClient(HiveSyncConfig config) {
this.databaseName = config.getStringOrDefault(META_SYNC_DATABASE_NAME);
this.skipTableArchive = config.getBooleanOrDefault(GlueCatalogSyncClientConfig.GLUE_SKIP_TABLE_ARCHIVE);
this.enableMetadataTable = Boolean.toString(config.getBoolean(GLUE_METADATA_FILE_LISTING)).toUpperCase();
+ this.allPartitionsReadParallelism = config.getIntOrDefault(ALL_PARTITIONS_READ_PARALLELISM);
+ this.changedPartitionsReadParallelism = config.getIntOrDefault(CHANGED_PARTITIONS_READ_PARALLELISM);
+ this.changeParallelism = config.getIntOrDefault(PARTITION_CHANGE_PARALLELISM);
+ }
+
+ private List<Partition> getPartitionsSegment(Segment segment, String tableName) {
+ try {
+ List<Partition> partitions = new ArrayList<>();
+ String nextToken = null;
+ do {
+ GetPartitionsResponse result = awsGlue.getPartitions(GetPartitionsRequest.builder()
+ .databaseName(databaseName)
+ .tableName(tableName)
+ .segment(segment)
Review Comment:
Yeah, I was planning to at least add the new configurations to the public
documentation.
I will add tests as well.
As for removing dead code, I will take a closer look. The key problem is that
part of the code is needed for Hive sync, so I'm not sure there is much to
remove.
TBH it makes sense to me to use this approach by default once it is introduced,
but I understand that, to stay backward compatible, it may need to remain
behind a feature flag. The only problem is that only one approach has been
introduced so far, pushdown, and while this one is also a kind of pushdown, the
word would mean completely different things for Hive and AWS. So maybe it's OK
to leave it like this.
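
For reference, the read pattern in the diff above (one paginated loop per segment, with segments fetched in parallel) can be sketched roughly as follows. This is only an illustration, not the code in the PR: `PartitionSource`, `Page`, `inMemory`, and the class name are hypothetical stand-ins for the Glue client, and the modulo-based segment assignment is an assumption made so the example runs without AWS.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SegmentedListingSketch {

  /** One page of results plus the token for the next page (null when done). */
  static final class Page {
    final List<String> items;
    final String nextToken;

    Page(List<String> items, String nextToken) {
      this.items = items;
      this.nextToken = nextToken;
    }
  }

  /** Hypothetical stand-in for the catalog client; NOT the AWS SDK API. */
  interface PartitionSource {
    Page fetch(int segmentNumber, int totalSegments, String nextToken);
  }

  /** Drains one segment by following nextToken until the source returns null. */
  static List<String> readSegment(PartitionSource source, int segmentNumber, int totalSegments) {
    List<String> partitions = new ArrayList<>();
    String nextToken = null;
    do {
      Page page = source.fetch(segmentNumber, totalSegments, nextToken);
      partitions.addAll(page.items);
      nextToken = page.nextToken;
    } while (nextToken != null);
    return partitions;
  }

  /** Reads every segment on its own thread and concatenates results in segment order. */
  static List<String> readAll(PartitionSource source, int totalSegments) {
    ExecutorService pool = Executors.newFixedThreadPool(totalSegments);
    try {
      List<Future<List<String>>> futures = new ArrayList<>();
      for (int i = 0; i < totalSegments; i++) {
        final int segment = i;
        Callable<List<String>> task = () -> readSegment(source, segment, totalSegments);
        futures.add(pool.submit(task));
      }
      List<String> all = new ArrayList<>();
      for (Future<List<String>> future : futures) {
        all.addAll(future.get());
      }
      return all;
    } catch (InterruptedException | ExecutionException e) {
      throw new RuntimeException("Failed to read partitions", e);
    } finally {
      pool.shutdown();
    }
  }

  /** In-memory fake (assumption): item i belongs to segment i % totalSegments. */
  static PartitionSource inMemory(List<String> partitions, int pageSize) {
    return (segmentNumber, totalSegments, nextToken) -> {
      List<String> mine = new ArrayList<>();
      for (int i = 0; i < partitions.size(); i++) {
        if (i % totalSegments == segmentNumber) {
          mine.add(partitions.get(i));
        }
      }
      // The token is simply the offset of the next page within this segment.
      int start = nextToken == null ? 0 : Integer.parseInt(nextToken);
      int end = Math.min(start + pageSize, mine.size());
      String token = end < mine.size() ? String.valueOf(end) : null;
      return new Page(new ArrayList<>(mine.subList(start, end)), token);
    };
  }
}
```

In this sketch the `totalSegments` argument plays the role that the read-parallelism settings play in the constructor above; in the real client the value would come from configuration rather than a method argument.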