[
https://issues.apache.org/jira/browse/KAFKA-3821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16404435#comment-16404435
]
Gunnar Morling edited comment on KAFKA-3821 at 3/19/18 7:07 AM:
----------------------------------------------------------------
Hey [~rhauch], another alternative API crossed my mind. It's a bigger deviation
from the current design but also is more flexible in terms of future extensions:
{code}
public interface WorkerSourceTask2 {
public void poll(SourceRecordReceiver receiver);
// other methods from WorkerSourceTask
public interface SourceRecordReceiver {
void sourceRecord(SourceRecord record);
void offset(Map<String, ?> sourcePartition, Map<String, ?>
sourceOffset);
}
}
{code}
I.e. instead of having {{SourceRecord}} subclasses, there'd be a new
receiver-type parameter which is invoked by {{poll()}} implementations to emit
any records. This also opens up the door for future other extensions, e.g. a
task might enforce commit of offsets at a given time. Adding methods like that
can be done in a compatible way via {{SourceRecordReceiver}} (as it's
implemented by KC, not connectors). As a small further plus, it also avoids
allocation of lists if just a single record must be emitted.
Any thoughts on that one? If you don't rule it out for some blocking reason I
oversaw, how should this be discussed then? As an alternative in the KIP
document?
was (Author: gunnar.morling):
Hey [~rhauch], another alternative API crossed my mind. It's a bigger deviation
from the current design but also is more flexible in terms of future extensions:
{code}
public interface WorkerSourceTask2 {
public void poll(SourceRecordReceiver receiver);
// other methods from WorkerSourceTask
public interface SourceRecordReceiver {
void sourceRecord(SourceRecord record);
void offset(Map<String, ?> sourcePartition, Map<String, ?>
sourceOffset);
}
}
{code}
I.e. instead of having {{SourceRecord}} subclasses, there'd be a new
receiver-type parameter which is invoked to emit any records. This also opens
up the door for future other extensions, e.g. a task might enforce commit of
offsets at a given time. Adding methods like that can be done in a compatible
way via {{SourceRecordReceiver}} (as it's implemented by KC, not connectors).
As a small further plus, it also avoids allocation of lists if just a single
record must be emitted.
Any thoughts on that one? If you don't rule it out for some blocking reason I
oversaw, how should this be discussed then? As an alternative in the KIP
document?
> Allow Kafka Connect source tasks to produce offset without writing to topics
> ----------------------------------------------------------------------------
>
> Key: KAFKA-3821
> URL: https://issues.apache.org/jira/browse/KAFKA-3821
> Project: Kafka
> Issue Type: Improvement
> Components: KafkaConnect
> Affects Versions: 0.9.0.1
> Reporter: Randall Hauch
> Priority: Major
> Labels: needs-kip
> Fix For: 1.2.0
>
>
> Provide a way for a {{SourceTask}} implementation to record a new offset for
> a given partition without necessarily writing a source record to a topic.
> Consider a connector task that uses the same offset when producing an unknown
> number of {{SourceRecord}} objects (e.g., it is taking a snapshot of a
> database). Once the task completes those records, the connector wants to
> update the offsets (e.g., the snapshot is complete) but has no more records
> to be written to a topic. With this change, the task could simply supply an
> updated offset.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)