Hongshun Wang created FLINK-39732:
-------------------------------------

             Summary: Introduce TableDiscoverer SPI for flexible table subscription (with default JdbcTableDiscoverer)
                 Key: FLINK-39732
                 URL: https://issues.apache.org/jira/browse/FLINK-39732
             Project: Flink
          Issue Type: Improvement
          Components: Flink CDC
    Affects Versions: cdc-3.5.0
            Reporter: Hongshun Wang
             Fix For: cdc-3.6.0


h3. Background
Flink CDC currently supports subscribing to source tables only through a
static pattern (regex/glob) syntax. This works well in many cases, but it is
not flexible enough for users who want to discover the tables to capture
dynamically from the source database itself, for example by querying the
catalog of a PostgreSQL or MySQL instance, filtering by tags/owners, or
applying business-specific selection rules.

 
h3. Motivation
 * Users often want to subscribe to "all tables in database X that match some 
runtime condition", which cannot be expressed cleanly with a static pattern.
 * Different databases (MySQL, PostgreSQL, Oracle, SQL Server, MongoDB, etc.) 
and different organizations have different rules for what "the right set of 
tables" means.
 * A pluggable discovery layer would let users (and connector authors) extend 
subscription behavior without touching CDC core code.

 
h3. Proposal
Introduce a new SPI, TableDiscoverer, that resolves the concrete list of
tables the CDC job should capture. Discovery runs at job startup and,
optionally, periodically afterwards to pick up newly created tables.
 
public interface TableDiscoverer extends Serializable {
    /** Identifier used in connector options, e.g. "pattern", "jdbc", "custom-x". */
    String identifier();

    /** Discover tables based on user configuration and source connection info. */
    List<TableId> discover(TableDiscoveryContext context);
}
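
To illustrate how a user-supplied discoverer might plug into this SPI, here is a minimal sketch. It is not part of the proposal: the issue does not define TableDiscoveryContext, so the options() and availableTables() accessors, the simplified TableId record, the PrefixTableDiscoverer class, and the "discoverer.prefix" option key are all assumptions made for illustration.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Simplified stand-in for a table identifier (assumption for this sketch).
record TableId(String schema, String table) implements Serializable {}

// Hypothetical context shape; the issue does not specify its methods.
interface TableDiscoveryContext {
    Map<String, String> options();      // user-supplied connector options
    List<TableId> availableTables();    // assumed: tables visible on the source
}

// The SPI as proposed in the issue.
interface TableDiscoverer extends Serializable {
    String identifier();
    List<TableId> discover(TableDiscoveryContext context);
}

// Example custom discoverer: keeps tables whose name starts with a configured prefix.
final class PrefixTableDiscoverer implements TableDiscoverer {
    @Override
    public String identifier() {
        return "custom-prefix";
    }

    @Override
    public List<TableId> discover(TableDiscoveryContext context) {
        // "discoverer.prefix" is a made-up option key; an empty prefix matches everything.
        String prefix = context.options().getOrDefault("discoverer.prefix", "");
        List<TableId> result = new ArrayList<>();
        for (TableId id : context.availableTables()) {
            if (id.table().startsWith(prefix)) {
                result.add(id);
            }
        }
        return result;
    }
}
```

A real implementation would presumably query the source database through connection details carried by the context rather than receive a precomputed table list; the list is used here only to keep the sketch self-contained.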
 
 
h3. Default implementation: JdbcTableDiscoverer

 
A built-in JdbcTableDiscoverer gives MySQL, PostgreSQL, Oracle, SQL Server, 
etc. a sensible default discovery strategy out of the box, while still letting 
each dialect override edge cases (e.g., PostgreSQL publication filtering, MySQL 
information_schema shortcuts).
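
One way such a default with per-dialect overrides could be structured is sketched below, under stated assumptions. The class names JdbcTableDiscovererSketch and PostgresDiscovererSketch and the accept() hook are hypothetical and not taken from Flink CDC. The system-schema filter is a simpler stand-in for the publication filtering mentioned above. Only the standard java.sql.DatabaseMetaData.getTables call is an existing API.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical base class; not the actual Flink CDC implementation.
class JdbcTableDiscovererSketch {

    /** Lists base tables via standard JDBC metadata, then applies the dialect hook. */
    public List<String> discover(Connection connection, String schemaPattern)
            throws SQLException {
        List<String> tables = new ArrayList<>();
        DatabaseMetaData metaData = connection.getMetaData();
        // catalog = null: do not narrow by catalog; "%" matches every table name.
        try (ResultSet rs =
                metaData.getTables(null, schemaPattern, "%", new String[] {"TABLE"})) {
            while (rs.next()) {
                String schema = rs.getString("TABLE_SCHEM"); // may be null on some drivers
                String table = rs.getString("TABLE_NAME");
                if (accept(schema, table)) {
                    tables.add(schema == null ? table : schema + "." + table);
                }
            }
        }
        return tables;
    }

    /** Dialect hook: the default accepts every table. */
    protected boolean accept(String schema, String table) {
        return true;
    }
}

// Illustrative dialect override: skip PostgreSQL system schemas.
class PostgresDiscovererSketch extends JdbcTableDiscovererSketch {
    private static final Set<String> SYSTEM_SCHEMAS =
            Set.of("pg_catalog", "information_schema");

    @Override
    protected boolean accept(String schema, String table) {
        // Null check first: Set.of(...) rejects null lookups.
        return schema == null || !SYSTEM_SCHEMAS.contains(schema);
    }
}
```

Keeping the metadata query in the base class and the filtering in an overridable hook means a dialect only has to express what differs from the generic JDBC behavior.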



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
