[
https://issues.apache.org/jira/browse/SAMZA-180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Martin Kleppmann reassigned SAMZA-180:
--------------------------------------
Assignee: Martin Kleppmann
> Support one-time offset reset for a Samza job
> ---------------------------------------------
>
> Key: SAMZA-180
> URL: https://issues.apache.org/jira/browse/SAMZA-180
> Project: Samza
> Issue Type: Bug
> Components: container
> Affects Versions: 0.6.0
> Reporter: Chris Riccomini
> Assignee: Martin Kleppmann
>
> Samza currently has a systems.%s.streams.%s.samza.reset.offset configuration.
> When set to "true", this configuration tells each SamzaContainer to disregard
> the checkpointed offsets for a stream when starting up. The problem with this
> configuration is that the checkpoints are disregarded every time the
> SamzaContainer starts up, not just the first time. If the host running a
> SamzaContainer fails, and YARN (or some other mechanism) restarts the
> container, it will not pick up where it left off. Instead, it will disregard
> the checkpointed offsets again and start over.
> There are some use-cases where developers wish to have a one-time reset of
> the checkpointed offsets. That is, they want to reset the offsets exactly
> once, without later failures triggering another reset. This is typically
> useful in bootstrapping cases (related to SAMZA-179), where a developer
> wishes to reset their task back to offset 0, process all messages up to the
> head of a stream, and then shut down. Right now, the developer can set
> reset.offset=true, and auto.offset.reset=smallest (if reprocessing a Kafka
> topic), but if the container ever restarts, processing will begin again from
> offset 0. This is not ideal.
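
The difference between the current behaviour and the requested one-time reset
can be illustrated with a small simulation. This is only a sketch under stated
assumptions, not Samza code: CheckpointStore, starting_offset, run_container,
and the "reset_applied" marker are hypothetical names, and recording a durable
marker is just one possible way a one-time reset might be implemented. Offset 0
stands in for whatever auto.offset.reset=smallest would resolve to.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class CheckpointStore:
    """Hypothetical stand-in for durable state that survives restarts."""
    offsets: Dict[str, int] = field(default_factory=dict)
    # Assumption: streams whose one-time reset has already been honoured.
    reset_applied: Set[str] = field(default_factory=set)


def starting_offset(store: CheckpointStore, stream: str,
                    reset_offset: bool, one_time: bool) -> int:
    """Choose the offset a container starts from for `stream`."""
    if reset_offset:
        if not one_time:
            # Behaviour described in the issue: the checkpoint is
            # disregarded on every start, including restarts.
            return 0
        if stream not in store.reset_applied:
            # Hypothetical one-time reset: honour the flag once and
            # remember that fact durably.
            store.reset_applied.add(stream)
            return 0
    # No reset pending: resume from the checkpoint, or 0 if none exists.
    return store.offsets.get(stream, 0)


def run_container(store: CheckpointStore, stream: str, reset_offset: bool,
                  one_time: bool, messages: int) -> int:
    """Simulate one container run; return the offset it started from."""
    start = starting_offset(store, stream, reset_offset, one_time)
    # Process `messages` messages, then checkpoint the new position.
    store.offsets[stream] = start + messages
    return start


if __name__ == "__main__":
    stream = "my-topic"  # placeholder stream name

    always = CheckpointStore(offsets={stream: 500})
    first = run_container(always, stream, True, False, 100)
    after_restart = run_container(always, stream, True, False, 100)
    print("reset on every start:", first, after_restart)

    once = CheckpointStore(offsets={stream: 500})
    first = run_container(once, stream, True, True, 100)
    after_restart = run_container(once, stream, True, True, 100)
    print("one-time reset:", first, after_restart)
```

In the first scenario the restarted container begins at offset 0 again, which
is the problem described above. In the second, the restart resumes from the
offset checkpointed by the first run.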
--
This message was sent by Atlassian JIRA
(v6.2#6252)