Users need not worry about checkpointing. By default, Samza automatically commits offsets every 60 seconds. You can commit more often by either:

1. Setting task.commit.ms to a smaller value, or
2. Committing manually: set task.commit.ms = -1 and call taskCoordinator.commit() yourself.
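
As a rough sketch, the two options might look like this in the job's properties file. The 10000 ms value is only an illustration, not a recommendation, and the two settings are alternatives, so only one should be active at a time:

```properties
# Option 1: commit more often than the 60000 ms default.
# 10000 is an assumed, illustrative value.
task.commit.ms=10000

# Option 2: turn off automatic commits and commit from the task instead.
# task.commit.ms=-1
```

With option 2, nothing is checkpointed unless the task itself calls commit on the coordinator, so a task that never calls it will reprocess from its last checkpoint on restart.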

I'm curious why resuming from the exact previous offset is unacceptable in your use case. Say you process up to offset 100 and then crash: wouldn't you want to resume from 100?

On Tue, Mar 1, 2016 at 1:41 PM, Jeff Ramin <jeff.ra...@singlewire.com> wrote:

> On 03/01/2016 03:10 PM, Jagadish Venkatraman wrote:
>
>> You don't have to implement any state checkpoint. Samza automatically
>> checkpoints state for you. When you recover from a failure/restart you
>> will resume processing from the previous checkpoint.
>
> So, it's merely a configuration issue?
>
>> What's your usecase?
>
> Pretty standard: have a consumer processing messages, which dies. When it
> comes back up, it needs to process messages not just from when it died,
> but perhaps 24 hours prior to that time.
>
> --
> Jeff Ramin
> Software Engineer
> Singlewire Software
> 2601 W Beltline Hwy #510
> Madison, WI 53713
>
> Phone Direct - 608.661.1172
> www.singlewire.com

--
Jagadish V,
Graduate Student,
Department of Computer Science,
Stanford University