Hongshun Wang created FLINK-36618:
-------------------------------------
Summary: Improve PostgresDialect.discoverDataCollections to reduce
the start time of Postgres CDC
Key: FLINK-36618
URL: https://issues.apache.org/jira/browse/FLINK-36618
Project: Flink
Issue Type: Improvement
Components: Flink CDC
Affects Versions: cdc-3.2.0
Reporter: Hongshun Wang
Fix For: cdc-3.3.0
Currently, PostgresDialect.discoverDataCollections looks up the table schema
separately for each table.
{code:java}
@Override
public Map<TableId, TableChange> discoverDataCollectionSchemas(JdbcSourceConfig sourceConfig) {
    final List<TableId> capturedTableIds = discoverDataCollections(sourceConfig);

    try (JdbcConnection jdbc = openJdbcConnection(sourceConfig)) {
        // fetch table schemas
        Map<TableId, TableChange> tableSchemas = new HashMap<>();
        for (TableId tableId : capturedTableIds) {
            TableChange tableSchema = queryTableSchema(jdbc, tableId);
            tableSchemas.put(tableId, tableSchema);
        }
        return tableSchemas;
    } catch (Exception e) {
        throw new FlinkRuntimeException(
                "Error to discover table schemas: " + e.getMessage(), e);
    }
}
{code}
I have a job whose table name pattern matches hundreds of tables, and it takes
1 hour to start.
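One possible direction is to fetch the schemas of all captured tables in a single batched lookup instead of once per table. The following is only a rough sketch of that idea, not the actual Flink CDC or Debezium API: FakeCatalog, querySchema, querySchemas, discoverPerTable and discoverBatched are hypothetical names, and the "round trip" counter merely simulates database calls to show how the two approaches scale with the number of tables.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BatchSchemaDiscoverySketch {

    /** Hypothetical stand-in for the database catalog; counts simulated round trips. */
    static class FakeCatalog {
        int roundTrips = 0;

        // One simulated round trip per table, like the loop shown in this issue.
        String querySchema(String tableId) {
            roundTrips++;
            return "schema-of-" + tableId;
        }

        // One simulated round trip for all tables (assumed batched lookup).
        Map<String, String> querySchemas(List<String> tableIds) {
            roundTrips++;
            Map<String, String> result = new HashMap<>();
            for (String tableId : tableIds) {
                result.put(tableId, "schema-of-" + tableId);
            }
            return result;
        }
    }

    // Current shape: the number of round trips grows linearly with the table count.
    static Map<String, String> discoverPerTable(FakeCatalog catalog, List<String> tableIds) {
        Map<String, String> schemas = new HashMap<>();
        for (String tableId : tableIds) {
            schemas.put(tableId, catalog.querySchema(tableId));
        }
        return schemas;
    }

    // Proposed shape: a single round trip regardless of the table count.
    static Map<String, String> discoverBatched(FakeCatalog catalog, List<String> tableIds) {
        return catalog.querySchemas(tableIds);
    }

    public static void main(String[] args) {
        List<String> tableIds = new ArrayList<>();
        for (int i = 0; i < 300; i++) {
            tableIds.add("public.t" + i);
        }

        FakeCatalog perTableCatalog = new FakeCatalog();
        FakeCatalog batchedCatalog = new FakeCatalog();
        Map<String, String> perTable = discoverPerTable(perTableCatalog, tableIds);
        Map<String, String> batched = discoverBatched(batchedCatalog, tableIds);

        System.out.println("same result: " + perTable.equals(batched));
        System.out.println("per-table round trips: " + perTableCatalog.roundTrips);
        System.out.println("batched round trips: " + batchedCatalog.roundTrips);
    }
}
```

Both methods return the same map, so a change along these lines would be limited to how the schemas are fetched. Whether a real batched read is available through the existing JDBC connection would need to be checked against the connector code.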
--
This message was sent by Atlassian Jira
(v8.20.10#820010)