[ 
https://issues.apache.org/jira/browse/KAFKA-6811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16491129#comment-16491129
 ] 

Sandeep Tamhankar commented on KAFKA-6811:
------------------------------------------

I'd suggest taking this one step further: I am working on a sink connector to 
push data to an external system. Establishing a connection/session in that 
system has a non-trivial cost, but once established, that session can be shared 
by all tasks. Currently, the task has no reference to the owning connector 
object, nor does the connector have the ability to send non-string objects to 
the task (e.g. Connector.taskConfigs only supports string values in the map).

Thus, a Connector developer has two choices:
 1. Have each task create its own session to the external system.
 2. Have a static member (session) on the Connector class that the Task can 
access when needed.

Option 1 is inefficient and unnecessarily uses resources in the Connector 
process as well as the external system.

Option 2 gets really ugly really fast: say you have two instances of the 
Connector with different configurations (connecting to different instances of 
the external service, for different topics). That single static member will no 
longer be appropriate – you need a static map, keyed on some unique identifier 
(specified in the connector config), to distinguish one Connector instance's 
session from another's. You also need to expose a static method to allow 
tasks to access the session that is appropriate for them. Doable, yes, but 
quite a bit of arm-twisting.
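
To make the Option 2 workaround concrete, here is a minimal sketch of such a static map. All names in it (Session, SessionRegistry, getOrCreate, the "conn-a" style keys) are hypothetical and are not part of the Kafka Connect API. It also assumes the tasks run in the same JVM and class loader as the connector that registered the session, which may not hold in every deployment. The key itself would travel to the task as an ordinary string value in the task config.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for an expensive connection to the external system. */
final class Session {
    final String endpoint;

    Session(String endpoint) {
        this.endpoint = endpoint;
    }
}

/** Hypothetical static registry holding one shared Session per connector instance. */
final class SessionRegistry {
    // Keyed on a unique identifier taken from the connector config (an assumption).
    private static final Map<String, Session> SESSIONS = new ConcurrentHashMap<>();

    private SessionRegistry() {}

    /** Returns the session for this key, creating it at most once per key. */
    static Session getOrCreate(String sessionKey, String endpoint) {
        return SESSIONS.computeIfAbsent(sessionKey, k -> new Session(endpoint));
    }

    /** Drops the session for this key, e.g. when the connector stops. */
    static Session remove(String sessionKey) {
        return SESSIONS.remove(sessionKey);
    }
}
```

A task would call SessionRegistry.getOrCreate with the key from its own config, so two connector instances with different keys never see each other's session. Cleanup (closing the session, deciding who calls remove) is left out of the sketch, and that is part of the arm-twisting.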

> Tasks should have access to connector and task metadata
> -------------------------------------------------------
>
>                 Key: KAFKA-6811
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6811
>             Project: Kafka
>          Issue Type: Improvement
>          Components: KafkaConnect
>            Reporter: Jeremy Custenborder
>            Priority: Major
>
> As a connector developer, it would be nice to have access to more metadata 
> from within a (Source|Sink)Task. For example, I could use this to include 
> task-specific data in the log. There are several connectors where I only run a 
> single task but would be able to do taskId() % totalTasks() for partitioning.
> At a high level, I'm thinking of something like this:
> {code:java}
> String connectorName();
> int taskId();
> int totalTasks();
> {code}
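
One possible reading of the partitioning idea in the quoted description, sketched under assumptions: if a task knew its own id and the total task count, it could claim every work item whose index modulo the task count equals its id. The TaskPartitioning class and itemsForTask method are hypothetical, and the taskId and totalTasks parameters stand in for the proposed accessors, which do not exist in the API as described.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits numbered work items across tasks by modulo. */
final class TaskPartitioning {
    private TaskPartitioning() {}

    /** Returns the indices i in [0, itemCount) with i % totalTasks == taskId. */
    static List<Integer> itemsForTask(int taskId, int totalTasks, int itemCount) {
        if (totalTasks <= 0 || taskId < 0 || taskId >= totalTasks) {
            throw new IllegalArgumentException("taskId must be in [0, totalTasks)");
        }
        List<Integer> owned = new ArrayList<>();
        // Start at the task's own id and step by the task count.
        for (int i = taskId; i < itemCount; i += totalTasks) {
            owned.add(i);
        }
        return owned;
    }
}
```

For example, with three tasks and eight items, itemsForTask(1, 3, 8) yields the indices 1, 4 and 7.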



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
