[
https://issues.apache.org/jira/browse/SAMZA-856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Branislav Cogic reassigned SAMZA-856:
-------------------------------------
Assignee: Branislav Cogic
> Optionally automatically commit based on number of processed messages
> ---------------------------------------------------------------------
>
> Key: SAMZA-856
> URL: https://issues.apache.org/jira/browse/SAMZA-856
> Project: Samza
> Issue Type: Improvement
> Components: container
> Reporter: Elias Levy
> Assignee: Branislav Cogic
>
> Currently Samza support automatic checkpoint commits based on time via the
> task.commit.ms property. The number of messages processed during any time
> window will vary with the throughput of the system. Thus, the current
> automatic checkpointing can't guarantee a maximum number of messages being
> reprocessed when recovering after a failure.
> I propose the addition of an option that would automatically commit
> checkpoints after a configurable number of messages have been processed.
> The messages could be counted per container, per task, or per stream.
> Properties could be named task.commit.msg.container.cnt,
> task.commit.msg.task.cnt and/or task.commit.msg.stream.cnt.
> Alternatively, a per stream count limit could use different values for
> different streams. E.g. task.commit.msg.stream.<some_stream>.cnt=1000,
> task.commit.msg.stream.<some_other_stream>.cnt=200.
> A message count auto commit would be orthogonal to the existing time based
> auto commit and they could be used at the same time.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)