[jira] [Commented] (KAFKA-15841) Add Support for Topic-Level Partitioning in Kafka Connect

Henrique Mota (Jira) Mon, 19 Feb 2024 08:41:32 -0800


    [ 
https://issues.apache.org/jira/browse/KAFKA-15841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17818530#comment-17818530
 ]


Henrique Mota commented on KAFKA-15841:
---------------------------------------

Hello [~gharris1727]!

My use case is as follows:

1: I have multiple clients in each environment; the largest environment has 90 
clients (databases).

2: Each client has a database in one application, and we replicate 
approximately 100 tables from that database to another application's database, 
which is multi-tenant.

3: Previously, we had one topic per table, with several partitions per topic. 
We needed to ensure that if any client had inconsistent data, we could pause 
consumption for that client and continue processing data for the other 
clients. So we created a separate single-partition topic for each table and 
client. We then created an extension of the JDBC Sink that can pause a 
problematic topic and, after some time, attempt to resume consuming it. (We 
chose one topic per client instead of one partition per client to make 
identification easier.)

4: We have a JDBC Sink for each table.

5: We noticed that when we add more than one worker in this scenario, all 
topics are assigned to worker 0 and the other workers are left idle.

6: We tried to change the 'topics' property in the task configurations 
returned by 'taskConfigs(int maxTasks)', but Kafka Connect ignores this 
property when it is returned by that method.
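
To illustrate point 6, here is a minimal sketch of the kind of split we 
attempted: dividing the connector's topic list into one 'topics' value per 
task. It is plain Java with no Kafka Connect dependency; the class name, the 
method name and the topic names are hypothetical and only stand in for what a 
'taskConfigs(int maxTasks)' implementation might build. As described above, 
Kafka Connect does not honor a per-task 'topics' value, so this shows the 
intent rather than a working workaround.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TopicSplitSketch {

    // Hypothetical helper: distributes topic names round-robin into at most
    // maxTasks groups and returns one config map per group, each carrying its
    // own comma-separated "topics" value.
    public static List<Map<String, String>> splitTopics(List<String> topics, int maxTasks) {
        if (maxTasks < 1) {
            throw new IllegalArgumentException("maxTasks must be at least 1");
        }
        // Never create more groups than there are topics.
        int groups = Math.min(maxTasks, topics.size());

        List<List<String>> buckets = new ArrayList<>();
        for (int i = 0; i < groups; i++) {
            buckets.add(new ArrayList<>());
        }
        for (int i = 0; i < topics.size(); i++) {
            buckets.get(i % groups).add(topics.get(i));
        }

        List<Map<String, String>> configs = new ArrayList<>();
        for (List<String> bucket : buckets) {
            Map<String, String> config = new HashMap<>();
            config.put("topics", String.join(",", bucket));
            configs.add(config);
        }
        return configs;
    }
}
```

For example, splitting five per-client topics across two tasks would give the 
first task the first, third and fifth topics and the second task the 
remaining two.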

> Add Support for Topic-Level Partitioning in Kafka Connect
> ---------------------------------------------------------
>
>                 Key: KAFKA-15841
>                 URL: https://issues.apache.org/jira/browse/KAFKA-15841
>             Project: Kafka
>          Issue Type: Improvement
>          Components: connect
>            Reporter: Henrique Mota
>            Priority: Trivial
>
> In our organization, we utilize JDBC sink connectors to consume data from 
> various topics, where each topic is dedicated to a specific tenant with a 
> single partition. Recently, we developed a custom sink based on the standard 
> JDBC sink, enabling us to pause consumption of a topic when encountering 
> problematic records.
> However, we face limitations within Kafka Connect, as it doesn't allow for 
> appropriate partitioning of topics among workers. We attempted a workaround 
> by breaking down the topics list within the 'topics' parameter. 
> Unfortunately, Kafka Connect overrides this parameter after invoking the 
> {{taskConfigs(int maxTasks)}} method from the 
> {{org.apache.kafka.connect.connector.Connector}} class.
> We request the addition of support in Kafka Connect to enable the 
> partitioning of topics among workers without requiring a fork. This 
> enhancement would facilitate better load distribution and allow for more 
> flexible configurations, particularly in scenarios where topics are dedicated 
> to different tenants.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
