[ 
https://issues.apache.org/jira/browse/KAFKA-3821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16405059#comment-16405059
 ] 

Randall Hauch commented on KAFKA-3821:
--------------------------------------

Thanks, [~gunnar.morling]. I do tend to favor passing consumer implementations 
into methods rather than returning a list, but this is a fairly significant 
departure from the current implementation pattern for SourceTask. I guess we'd 
change Connect to call {{poll(SourceRecordReceiver)}}, which would have a default 
implementation that calls the existing (and probably deprecated) {{poll()}}. Is 
that what you were thinking?
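
To make the idea concrete, here is a minimal sketch of how a receiver-based {{poll}} could delegate to the existing list-returning {{poll()}}. The types {{SketchRecord}}, {{SketchSourceTask}}, and {{LegacyTask}}, and the {{sourceRecord}} method, are hypothetical simplifications for illustration and are not part of the Connect API; only the {{SourceRecordReceiver}} name comes from the discussion.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified stand-in for Connect's SourceRecord.
final class SketchRecord {
    final String topic;
    final Object value;

    SketchRecord(String topic, Object value) {
        this.topic = topic;
        this.value = value;
    }
}

// Hypothetical receiver; only the interface name comes from the discussion.
interface SourceRecordReceiver {
    void sourceRecord(SketchRecord record);
}

// Simplified stand-in for SourceTask (checked exceptions omitted for brevity).
abstract class SketchSourceTask {
    // Existing list-returning style, kept so current tasks still compile.
    @Deprecated
    public List<SketchRecord> poll() {
        return Collections.emptyList();
    }

    // Receiver style: the default implementation forwards whatever poll() returns.
    public void poll(SourceRecordReceiver receiver) {
        List<SketchRecord> records = poll();
        if (records == null) {
            return; // a null result means "nothing available right now"
        }
        for (SketchRecord record : records) {
            receiver.sourceRecord(record);
        }
    }
}

// An existing task that only overrides the list-returning method keeps working.
final class LegacyTask extends SketchSourceTask {
    @Override
    public List<SketchRecord> poll() {
        return Arrays.asList(new SketchRecord("t", "a"), new SketchRecord("t", "b"));
    }
}
```

Under these assumptions the framework would only ever call {{poll(receiver)}}; tasks written against the old method are reached through the default implementation, and new tasks override the receiver variant directly.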

One other advantage of this is that we'd be able to also add other methods to 
the {{SourceRecordReceiver}}, such as {{startTransaction(sourcePartition, 
sourceOffsets)}}, {{commitTransaction(sourcePartition, sourceOffsets)}}, and 
{{rollbackTransaction(sourcePartition, sourceOffsets)}} that might be added as 
part of KAFKA-6080. We'd probably want to consider a different name for this 
interface, too.
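
As a rough illustration of those extra methods, the sketch below shows a snapshot-style task wrapping its records in a transaction. The method names follow the ones listed above, but the signatures, the {{TransactionalReceiver}} and {{LoggingReceiver}} types, the {{SnapshotTask}} class, and the offset keys are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical receiver with transaction hooks; signatures are assumptions.
interface TransactionalReceiver {
    void sourceRecord(String topic, Object value);
    void startTransaction(Map<String, ?> sourcePartition, Map<String, ?> sourceOffsets);
    void commitTransaction(Map<String, ?> sourcePartition, Map<String, ?> sourceOffsets);
    void rollbackTransaction(Map<String, ?> sourcePartition, Map<String, ?> sourceOffsets);
}

// Test double that records the order of calls it receives.
final class LoggingReceiver implements TransactionalReceiver {
    final List<String> events = new ArrayList<>();

    public void sourceRecord(String topic, Object value) {
        events.add("record:" + value);
    }
    public void startTransaction(Map<String, ?> p, Map<String, ?> o) {
        events.add("start");
    }
    public void commitTransaction(Map<String, ?> p, Map<String, ?> o) {
        events.add("commit");
    }
    public void rollbackTransaction(Map<String, ?> p, Map<String, ?> o) {
        events.add("rollback");
    }
}

// Hypothetical task that emits a whole snapshot inside one transaction.
final class SnapshotTask {
    private final Map<String, String> partition =
            Collections.singletonMap("db", "inventory");

    void poll(TransactionalReceiver receiver, List<String> rows) {
        Map<String, String> inProgress = Collections.singletonMap("snapshot", "in-progress");
        receiver.startTransaction(partition, inProgress);
        try {
            for (String row : rows) {
                receiver.sourceRecord("inventory.rows", row);
            }
            // The offset only advances to "complete" together with the records.
            receiver.commitTransaction(partition, Collections.singletonMap("snapshot", "complete"));
        } catch (RuntimeException e) {
            receiver.rollbackTransaction(partition, inProgress);
            throw e;
        }
    }
}
```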

On the implementation side, it'd be great if we could eliminate the allocation 
of a list, but that might not be possible. One big advantage of this approach, 
however, is that it would allow the WorkerSourceTask to much more easily define 
different behaviors for different configurations. For example, the current 
implementation writes the offsets to OffsetStorageWriter, which asynchronously 
and periodically commits them via the KafkaOffsetBackingStore. However, if we 
add EoS behavior in KAFKA-6080, then the offsets need to be written to Kafka 
within the EoS transaction. Bottom line is that the WorkerSourceTask may need 
to do things differently, and this might help encapsulate that different logic 
into different classes.
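
A sketch of that encapsulation, under heavy simplification: two receiver implementations with different offset handling, chosen by configuration. None of these classes exist in Connect; {{RecordReceiver}}, {{PeriodicOffsetReceiver}}, {{TransactionalOffsetReceiver}}, {{Receivers.forConfig}}, and the in-memory maps standing in for the offset store and the Kafka transaction are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical receiver contract shared by both strategies.
interface RecordReceiver {
    void sourceRecord(Object value, String partition, String offset);
}

// Loosely mirrors the current behavior: records go out immediately,
// offsets are remembered and committed later by a periodic flush.
final class PeriodicOffsetReceiver implements RecordReceiver {
    final List<Object> sent = new ArrayList<>();
    final Map<String, String> pendingOffsets = new HashMap<>();
    final Map<String, String> committedOffsets = new HashMap<>();

    public void sourceRecord(Object value, String partition, String offset) {
        sent.add(value);
        pendingOffsets.put(partition, offset);
    }

    void flushOffsets() {
        committedOffsets.putAll(pendingOffsets);
        pendingOffsets.clear();
    }
}

// EoS-style variant: records and offsets are buffered together and only
// become visible when the surrounding transaction commits.
final class TransactionalOffsetReceiver implements RecordReceiver {
    private final List<Object> bufferedRecords = new ArrayList<>();
    private final Map<String, String> bufferedOffsets = new HashMap<>();
    final List<Object> sent = new ArrayList<>();
    final Map<String, String> committedOffsets = new HashMap<>();

    public void sourceRecord(Object value, String partition, String offset) {
        bufferedRecords.add(value);
        bufferedOffsets.put(partition, offset);
    }

    void commit() {
        sent.addAll(bufferedRecords);
        committedOffsets.putAll(bufferedOffsets);
        bufferedRecords.clear();
        bufferedOffsets.clear();
    }
}

// Hypothetical factory: the worker picks the strategy from configuration.
final class Receivers {
    static RecordReceiver forConfig(boolean exactlyOnce) {
        return exactlyOnce ? new TransactionalOffsetReceiver() : new PeriodicOffsetReceiver();
    }
}
```

The task code stays the same in both cases; only the receiver handed to it differs.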

[~gunnar.morling], you asked about an alternative KIP. We don't yet have a KIP 
for this issue or for KAFKA-6080. IMO, it'd be better to start a KIP for 6080 
and then include this concept in the approach. Want to start that?

> Allow Kafka Connect source tasks to produce offset without writing to topics
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-3821
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3821
>             Project: Kafka
>          Issue Type: Improvement
>          Components: KafkaConnect
>    Affects Versions: 0.9.0.1
>            Reporter: Randall Hauch
>            Priority: Major
>              Labels: needs-kip
>             Fix For: 1.2.0
>
>
> Provide a way for a {{SourceTask}} implementation to record a new offset for 
> a given partition without necessarily writing a source record to a topic.
> Consider a connector task that uses the same offset when producing an unknown 
> number of {{SourceRecord}} objects (e.g., it is taking a snapshot of a 
> database). Once the task completes those records, the connector wants to 
> update the offsets (e.g., the snapshot is complete) but has no more records 
> to be written to a topic. With this change, the task could simply supply an 
> updated offset.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
