[
https://issues.apache.org/jira/browse/KAFKA-15841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17818530#comment-17818530
]
Henrique Mota commented on KAFKA-15841:
---------------------------------------
Hello [~gharris1727]!
My use case is as follows:
1: I have multiple clients in each environment; the largest environment has
90 clients (databases).
2: Each client has its own database in one application, and we replicate
approximately 100 tables from that database to another application's
database, which is multi-tenant.
3: Previously, we had one topic per table, with some partitions for each
topic. We needed to ensure that if any client had inconsistent data, we could
pause consumption for that client and continue processing data for the other
clients. We therefore created a separate topic, with a single partition, for
each table and client. We then created an extension of the JDBC Sink that can
pause a problematic topic and, after some time, attempt to resume consuming
it (we chose one topic per client instead of one partition per client to make
identification easier).
4: We have a JDBC Sink for each table.
5: We noticed that when we add more than one worker in this scenario, all
topics are assigned to worker 0 and the other workers are left idle.
6: We tried to change the 'topics' property in the task configurations
returned by the 'taskConfigs(int maxTasks)' method, but Kafka Connect ignores
this property when it is returned by 'taskConfigs(int maxTasks)'.
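For illustration only, the split attempted in point 6 might look roughly like
the sketch below. It is plain Java with no Kafka dependency; the class name
'TopicSplitSketch', the helper 'splitTopics', and the round-robin strategy are
assumptions made for this example, not the actual connector code. As described
above, Kafka Connect does not honor a per-task 'topics' value returned this
way, so this shows the intent of the workaround rather than a working fix.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TopicSplitSketch {

    // Hypothetical helper: builds one config map per task, the shape that a
    // connector's taskConfigs(int maxTasks) returns, giving each task its
    // own round-robin subset of the comma-separated 'topics' list.
    public static List<Map<String, String>> splitTopics(
            Map<String, String> connectorConfig, int maxTasks) {
        String[] topics = connectorConfig.get("topics").trim().split("\\s*,\\s*");

        // Never create more groups than topics, and always at least one.
        int groups = Math.max(1, Math.min(maxTasks, topics.length));

        List<List<String>> buckets = new ArrayList<>();
        for (int i = 0; i < groups; i++) {
            buckets.add(new ArrayList<>());
        }
        // Round-robin assignment: topic i goes to bucket i % groups.
        for (int i = 0; i < topics.length; i++) {
            buckets.get(i % groups).add(topics[i]);
        }

        List<Map<String, String>> taskConfigs = new ArrayList<>();
        for (List<String> bucket : buckets) {
            // Copy the connector config and narrow 'topics' for this task.
            Map<String, String> taskConfig = new HashMap<>(connectorConfig);
            taskConfig.put("topics", String.join(",", bucket));
            taskConfigs.add(taskConfig);
        }
        return taskConfigs;
    }

    public static void main(String[] args) {
        // Topic names are made up for the example.
        Map<String, String> config = new HashMap<>();
        config.put("topics", "client1.orders,client2.orders,client3.orders");
        for (Map<String, String> taskConfig : splitTopics(config, 2)) {
            System.out.println(taskConfig.get("topics"));
        }
    }
}
```

With three topics and maxTasks = 2, the first task config would list the
first and third topics and the second would list the second topic.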
> Add Support for Topic-Level Partitioning in Kafka Connect
> ---------------------------------------------------------
>
> Key: KAFKA-15841
> URL: https://issues.apache.org/jira/browse/KAFKA-15841
> Project: Kafka
> Issue Type: Improvement
> Components: connect
> Reporter: Henrique Mota
> Priority: Trivial
>
> In our organization, we utilize JDBC sink connectors to consume data from
> various topics, where each topic is dedicated to a specific tenant with a
> single partition. Recently, we developed a custom sink based on the standard
> JDBC sink, enabling us to pause consumption of a topic when encountering
> problematic records.
> However, we face limitations within Kafka Connect, as it doesn't allow for
> appropriate partitioning of topics among workers. We attempted a workaround
> by breaking down the topics list within the 'topics' parameter.
> Unfortunately, Kafka Connect overrides this parameter after invoking the
> {{taskConfigs(int maxTasks)}} method from the
> {{org.apache.kafka.connect.connector.Connector}} class.
> We request the addition of support in Kafka Connect to enable the
> partitioning of topics among workers without requiring a fork. This
> enhancement would facilitate better load distribution and allow for more
> flexible configurations, particularly in scenarios where topics are dedicated
> to different tenants.