I assume you meant Kafka offset -
https://kafka.apache.org/documentation/#intro_topics

I don't think this is currently possible, for two reasons.

(1) Currently the Kafka source can read either from a given topic or from a
set of topic partitions, but not from a given offset -
https://github.com/apache/beam/blob/master/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java#L523
(2) Currently a source has to be at the top of the pipeline graph to operate.
For example, you cannot initiate a Kafka source from a topic/partition read
from a database.

(2) should become possible once we have our next-generation source framework,
SplittableDoFn. Once we have that, maybe we can consider adding (1) as
well, if there is good justification for it. I think the policy
regarding the offset to start reading from is configured in the Kafka
cluster, and specifying a specific offset will not work if the corresponding
messages have already been purged by the Kafka cluster, so I'm not sure how
useful support for reading from a given offset would be.
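
To make the purging concern concrete, here is a toy Java sketch. It is not
Kafka or Beam code: OffsetRangeSketch, ResetPolicy and resolveStart are names
I made up for illustration, and the three policies are only loosely borrowed
from the earliest/latest/none values of Kafka's auto.offset.reset setting. It
assumes a partition retains offsets in the range [earliestRetained, logEnd)
and shows what could happen to a requested start offset that falls outside
that range.

```java
import java.util.OptionalLong;

/** Toy model of one partition's retained offset range; not Kafka or Beam code. */
final class OffsetRangeSketch {

    /** Hypothetical reset policy applied when the requested offset is out of range. */
    enum ResetPolicy { EARLIEST, LATEST, NONE }

    /**
     * Resolves a requested start offset against the retained range
     * [earliestRetained, logEnd). An offset equal to logEnd is accepted and
     * means "wait for new messages". Returns empty when the offset is out of
     * range and the policy is NONE, which models a hard failure.
     */
    static OptionalLong resolveStart(long requested, long earliestRetained,
                                     long logEnd, ResetPolicy policy) {
        if (requested >= earliestRetained && requested <= logEnd) {
            return OptionalLong.of(requested);
        }
        switch (policy) {
            case EARLIEST:
                return OptionalLong.of(earliestRetained);
            case LATEST:
                return OptionalLong.of(logEnd);
            default:
                return OptionalLong.empty();
        }
    }

    public static void main(String[] args) {
        // Assumed state: offsets 0..99 were purged, offsets 100..249 are retained.
        long earliest = 100;
        long end = 250;
        // In range: the requested offset is honoured.
        System.out.println(resolveStart(150, earliest, end, ResetPolicy.EARLIEST));
        // Purged: the read silently starts somewhere else.
        System.out.println(resolveStart(42, earliest, end, ResetPolicy.EARLIEST));
        // Purged with no fallback: there is nothing to read from.
        System.out.println(resolveStart(42, earliest, end, ResetPolicy.NONE));
    }
}
```

In this model a request for a purged offset either silently starts somewhere
else or fails outright, which is why I'm unsure a user-specified offset would
behave the way people expect.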

Thanks,
Cham
