[ 
https://issues.apache.org/jira/browse/SAMZA-180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martin Kleppmann reassigned SAMZA-180:
--------------------------------------

    Assignee: Martin Kleppmann

> Support one-time offset reset for a Samza job
> ---------------------------------------------
>
>                 Key: SAMZA-180
>                 URL: https://issues.apache.org/jira/browse/SAMZA-180
>             Project: Samza
>          Issue Type: Bug
>          Components: container
>    Affects Versions: 0.6.0
>            Reporter: Chris Riccomini
>            Assignee: Martin Kleppmann
>
> Samza currently has a systems.%s.streams.%s.samza.reset.offset configuration. 
> When set to "true", this configuration tells each SamzaContainer to disregard 
> the checkpointed offsets for a stream when starting up. The problem with this 
> configuration is that the checkpoints are disregarded every time the 
> SamzaContainer starts up, not just the first time. If a host that a 
> SamzaContainer is running on fails, and YARN (or some other mechanism) 
> restarts the SamzaContainer, the container will not pick up where it left 
> off, but will instead disregard the checkpointed offsets, and start over 
> again, as before.
>
> There are some use-cases where developers wish to have a one-time reset of 
> the checkpointed offsets. That is, they want to reset the offsets exactly 
> once, but then have failures not trigger another reset. This is typically 
> useful in bootstrapping cases (related to SAMZA-179), where a developer 
> wishes to reset their task back to offset 0, and process all messages up to the 
> head of a stream, then shut down. Right now, the developer can set 
> reset.offset=true, and auto.offset.reset=smallest (if reprocessing a Kafka 
> topic), but if the container ever restarts, processing will begin again from 
> offset 0. This is not ideal.
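
The restart behavior described above can be illustrated with a small toy model. This is only a sketch and is not Samza code: the names starting_offset and simulate, the checkpoint value of 100, and the one_time flag are all hypothetical, and the one_time branch shows one possible reading of "reset exactly once" rather than the design chosen for this issue.

```python
from typing import List, Optional


def starting_offset(checkpoint: Optional[int], reset_offset: bool) -> int:
    """Toy model: pick the offset a container starts reading from.

    With reset_offset set, the checkpoint is disregarded and reading
    starts at 0 (standing in for auto.offset.reset=smallest).
    """
    if reset_offset or checkpoint is None:
        return 0
    return checkpoint


def simulate(checkpoint: Optional[int], runs: int, per_run: int,
             reset_offset: bool, one_time: bool = False) -> List[int]:
    """Return the starting offset of each container start.

    Each run processes per_run messages, checkpoints, and then "fails",
    so the next run models a restart by YARN or a similar mechanism.
    one_time is a hypothetical flag: the reset is honored on the first
    start only and ignored on every later restart.
    """
    starts = []
    for run in range(runs):
        honor_reset = reset_offset and (run == 0 or not one_time)
        start = starting_offset(checkpoint, honor_reset)
        starts.append(start)
        checkpoint = start + per_run  # checkpoint written before the failure
    return starts


if __name__ == "__main__":
    # Current behavior: every restart goes back to offset 0.
    print(simulate(100, runs=3, per_run=10, reset_offset=True))
    # Hypothetical one-time reset: only the first start goes back to 0.
    print(simulate(100, runs=3, per_run=10, reset_offset=True, one_time=True))
    # No reset: the container resumes from its checkpoint.
    print(simulate(100, runs=3, per_run=10, reset_offset=False))
```

In this model the first call starts at offset 0 on every run, which is the problem reported here, while the one_time variant starts at 0 once and then resumes from its own checkpoints.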



--
This message was sent by Atlassian JIRA
(v6.2#6252)
