[ 
https://issues.apache.org/jira/browse/SAMZA-568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324997#comment-14324997
 ] 

Chris Riccomini commented on SAMZA-568:
---------------------------------------

I agree. I also think that startProducers, startTask, startConsumers (and 
stopping in reverse order) correctly models the flow of data. Data comes in 
from consumers, through the task, and into the producer. Starting them in 
reverse order feels intuitively correct.

> Start offset override in Task init
> ----------------------------------
>
>                 Key: SAMZA-568
>                 URL: https://issues.apache.org/jira/browse/SAMZA-568
>             Project: Samza
>          Issue Type: Improvement
>          Components: container
>    Affects Versions: 0.9.0
>            Reporter: Ben Kirwin
>            Priority: Minor
>         Attachments: 
> 0001-Allow-overriding-starting-offsets-in-TaskContext.patch
>
>
> A couple months back -- [on the mailing list | 
> http://mail-archives.apache.org/mod_mbox/incubator-samza-dev/201411.mbox/%3ccacux-d_zwzp2emqse4nou76skfh6bkifitzsmnm_b8dxjut...@mail.gmail.com%3E]
>  -- I mentioned a couple offset management issues I'd been having. (I'm happy 
> to elaborate on this, but in short: I associate some extra state / ordering 
> information with the input offsets, and there's a nontrivial performance cost 
> keeping Samza's checkpoints and my task's state in sync.)
> It occurs to me now that there's a simple workaround for this: disable 
> Samza's checkpointing entirely, and let `StreamTask.init` choose the starting 
> offsets. The task can just keep its checkpoints in an ordinary StorageEngine 
> -- and by managing all the state from a single place, it's easy to keep 
> everything in sync.
> The basic implementation actually seems fairly straightforward -- the 
> consumers are not started until after the tasks are initialized, so all we'd 
> need to do is allow the `init` method to override the starting offsets. I've 
> attached a small patch that exposes this through the TaskContext interface, 
> just to illustrate the idea -- if this seems like an interesting feature for 
> Samza, I'm happy to add more tests / documentation / etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to