[ https://issues.apache.org/jira/browse/SAMZA-568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Riccomini resolved SAMZA-568.
-----------------------------------
Resolution: Fixed
Fix Version/s: 0.9.0
+1 Merged and committed. Thanks!
> Start offset override in Task init
> ----------------------------------
>
> Key: SAMZA-568
> URL: https://issues.apache.org/jira/browse/SAMZA-568
> Project: Samza
> Issue Type: Improvement
> Components: container
> Affects Versions: 0.9.0
> Reporter: Ben Kirwin
> Assignee: Ben Kirwin
> Priority: Minor
> Fix For: 0.9.0
>
> Attachments:
> 0001-Allow-overriding-starting-offsets-in-TaskContext.patch,
> 0001-SAMZA-568-Start-offset-override.patch
>
>
> A couple months back -- [on the mailing list |
> http://mail-archives.apache.org/mod_mbox/incubator-samza-dev/201411.mbox/%3ccacux-d_zwzp2emqse4nou76skfh6bkifitzsmnm_b8dxjut...@mail.gmail.com%3E]
> -- I mentioned a couple of offset management issues I'd been having. (I'm happy
> to elaborate on this, but in short: I associate some extra state / ordering
> information with the input offsets, and there's a nontrivial performance cost
> to keeping Samza's checkpoints and my task's state in sync.)
>
> It occurs to me now that there's a simple workaround for this: disable
> Samza's checkpointing entirely, and let `StreamTask.init` choose the starting
> offsets. The task can just keep its checkpoints in an ordinary StorageEngine
> -- and by managing all the state from a single place, it's easy to keep
> everything in sync.
>
> The basic implementation actually seems fairly straightforward -- the
> consumers are not started until after the tasks are initialized, so all we'd
> need to do is allow the `init` method to override the starting offsets. I've
> attached a small patch that exposes this through the TaskContext interface,
> just to illustrate the idea -- if this seems like an interesting feature for
> Samza, I'm happy to add more tests / documentation / etc.
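
The proposal quoted above can be sketched in plain Java. This is only an illustration under stated assumptions: `OffsetContext`, `SketchContainer`, and `SelfCheckpointingTask` are hypothetical stand-ins invented for this sketch, not Samza's classes and not the contents of the attached patch, and partitions and offsets are modelled as plain strings. It shows the ordering the description relies on: the task's `init` runs before any consumer starts, so an offset set during `init` replaces the default starting offset.

```java
import java.util.Map;
import java.util.TreeMap;

// Small driver: partition-0 has a task-managed checkpoint, partition-1 does not.
final class StartOffsetSketch {
    public static void main(String[] args) {
        Map<String, String> defaults = new TreeMap<>();
        defaults.put("partition-0", "0");
        defaults.put("partition-1", "0");

        Map<String, String> saved = new TreeMap<>();
        saved.put("partition-0", "42");

        SketchContainer container = new SketchContainer(defaults);
        Map<String, String> started =
                container.initThenStart(new SelfCheckpointingTask(saved));
        System.out.println(started);
    }
}

// Hypothetical stand-in for the context object handed to a task's init method.
interface OffsetContext {
    void setStartingOffset(String partition, String offset);
}

// Hypothetical task that keeps its own checkpoints (here: an in-memory map
// standing in for an ordinary storage engine) and restores them in init.
final class SelfCheckpointingTask {
    private final Map<String, String> savedOffsets;

    SelfCheckpointingTask(Map<String, String> savedOffsets) {
        this.savedOffsets = savedOffsets;
    }

    void init(OffsetContext context) {
        // Override the starting offset for every partition we have state for.
        for (Map.Entry<String, String> entry : savedOffsets.entrySet()) {
            context.setStartingOffset(entry.getKey(), entry.getValue());
        }
    }
}

// Hypothetical container: holds the default starting offsets and lets init
// override them before the consumers would be started.
final class SketchContainer implements OffsetContext {
    private final Map<String, String> startingOffsets;

    SketchContainer(Map<String, String> defaults) {
        this.startingOffsets = new TreeMap<>(defaults);
    }

    @Override
    public void setStartingOffset(String partition, String offset) {
        startingOffsets.put(partition, offset);
    }

    // Runs init first, then returns the offsets the consumers would start from.
    Map<String, String> initThenStart(SelfCheckpointingTask task) {
        task.init(this);
        return new TreeMap<>(startingOffsets);
    }
}
```

Partitions the task has no saved state for keep their default offset, so a task could adopt this approach incrementally. The real interface and method names should be taken from the committed patch rather than from this sketch.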