[ 
https://issues.apache.org/jira/browse/KAFKA-3821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16404435#comment-16404435
 ] 

Gunnar Morling edited comment on KAFKA-3821 at 3/19/18 7:07 AM:
----------------------------------------------------------------

Hey [~rhauch], another alternative API crossed my mind. It's a bigger deviation 
from the current design but also is more flexible in terms of future extensions:

{code}
public interface WorkerSourceTask2 {

    public void poll(SourceRecordReceiver receiver);

    // other methods from WorkerSourceTask

    public interface SourceRecordReceiver {
        void sourceRecord(SourceRecord record);
        void offset(Map<String, ?> sourcePartition, Map<String, ?> 
sourceOffset);
    }
}
{code}

I.e. instead of having {{SourceRecord}} subclasses, there'd be a new 
receiver-type parameter which is invoked by {{poll()}} implementations to emit 
any records. This also opens up the door for future other extensions, e.g. a 
task might enforce commit of offsets at a given time. Adding methods like that 
can be done in a compatible way via {{SourceRecordReceiver}} (as it's 
implemented by KC, not connectors). As a small further plus, it also avoids 
allocation of lists if just a single record must be emitted.

Any thoughts on that one? If you don't rule it out for some blocking reason I 
oversaw, how should this be discussed then? As an alternative in the KIP 
document?


was (Author: gunnar.morling):
Hey [~rhauch], another alternative API crossed my mind. It's a bigger deviation 
from the current design but also is more flexible in terms of future extensions:

{code}
public interface WorkerSourceTask2 {

    public void poll(SourceRecordReceiver receiver);

    // other methods from WorkerSourceTask

    public interface SourceRecordReceiver {
        void sourceRecord(SourceRecord record);
        void offset(Map<String, ?> sourcePartition, Map<String, ?> 
sourceOffset);
    }
}
{code}

I.e. instead of having {{SourceRecord}} subclasses, there'd be a new 
receiver-type parameter which is invoked to emit any records. This also opens 
up the door for future other extensions, e.g. a task might enforce commit of 
offsets at a given time. Adding methods like that can be done in a compatible 
way via {{SourceRecordReceiver}} (as it's implemented by KC, not connectors). 
As a small further plus, it also avoids allocation of lists if just a single 
record must be emitted.

Any thoughts on that one? If you don't rule it out for some blocking reason I 
oversaw, how should this be discussed then? As an alternative in the KIP 
document?

> Allow Kafka Connect source tasks to produce offset without writing to topics
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-3821
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3821
>             Project: Kafka
>          Issue Type: Improvement
>          Components: KafkaConnect
>    Affects Versions: 0.9.0.1
>            Reporter: Randall Hauch
>            Priority: Major
>              Labels: needs-kip
>             Fix For: 1.2.0
>
>
> Provide a way for a {{SourceTask}} implementation to record a new offset for 
> a given partition without necessarily writing a source record to a topic.
> Consider a connector task that uses the same offset when producing an unknown 
> number of {{SourceRecord}} objects (e.g., it is taking a snapshot of a 
> database). Once the task completes those records, the connector wants to 
> update the offsets (e.g., the snapshot is complete) but has no more records 
> to be written to a topic. With this change, the task could simply supply an 
> updated offset.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to