[ 
https://issues.apache.org/jira/browse/SAMZA-354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120083#comment-14120083
 ] 

Chris Riccomini commented on SAMZA-354:
---------------------------------------

bq. My guess would be that there's only one large shop running Samza jobs gated 
by the ops team. The ops-driven type would be easily scriptable.

Yeah, I agree. The only real difference is whether the command runs forever or 
not. If it doesn't run forever, there's more coordination since the ops folks 
have to make sure the job is down before they run the migration. If it does run 
forever, though, I think it's risky since the tool might not be caught up when 
the job is re-deployed on 0.8.

I'm starting to favor the job-based approach: just have the 
KafkaCheckpointManager read from a 0.7 checkpoint topic if the 0.8 checkpoint 
topic is empty or doesn't exist. It actually wouldn't even need to do the 
migration itself; it could just read from 0.7 and populate the OffsetManager. 
From then on, the normal offset commits will cause the checkpoints to get 
migrated to 0.8.
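
To make the fallback concrete, here's a rough sketch of the read path. This is 
not the actual KafkaCheckpointManager code: CheckpointSource, 
readStartingOffsets, and the "input-topic/0" key format are all made up for 
illustration, and the two sources just stand in for "read the last checkpoint 
from the 0.8 topic" and "read the last checkpoint from the 0.7 topic".

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class CheckpointFallbackSketch {

    /** Hypothetical stand-in for reading the last checkpoint from one topic. */
    interface CheckpointSource {
        // Empty Optional: the topic does not exist.
        // Empty map: the topic exists but holds no checkpoint yet.
        Optional<Map<String, String>> readLastCheckpoint();
    }

    /**
     * Prefer the new-format (0.8) topic. Fall back to the old-format (0.7)
     * topic only when the new one is missing or empty. Nothing is written
     * here; the assumption is that the job's next regular offset commit
     * writes these offsets to the new topic.
     */
    static Map<String, String> readStartingOffsets(CheckpointSource newTopic,
                                                   CheckpointSource oldTopic) {
        Optional<Map<String, String>> current = newTopic.readLastCheckpoint();
        if (current.isPresent() && !current.get().isEmpty()) {
            return current.get();
        }
        // No usable 0.8 checkpoint: use 0.7 if present, else start with none.
        return oldTopic.readLastCheckpoint().orElseGet(HashMap::new);
    }

    public static void main(String[] args) {
        // Hypothetical data: one stream partition with a saved offset.
        Map<String, String> oldOffsets = new HashMap<>();
        oldOffsets.put("input-topic/0", "42");

        CheckpointSource missingNewTopic = () -> Optional.empty();
        CheckpointSource oldTopic = () -> Optional.of(oldOffsets);

        // The 0.8 topic does not exist, so the 0.7 offsets are used.
        System.out.println(readStartingOffsets(missingNewTopic, oldTopic));
    }
}
```

The point of the sketch is that the fallback is read-only: the 0.7 topic is 
never modified, and once the 0.8 topic holds a checkpoint the old topic is no 
longer consulted.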

> Write tool to convert old-style checkpoint log to post-SAMZA-123 format
> -----------------------------------------------------------------------
>
>                 Key: SAMZA-354
>                 URL: https://issues.apache.org/jira/browse/SAMZA-354
>             Project: Samza
>          Issue Type: Task
>    Affects Versions: 0.8.0
>            Reporter: Jakob Homan
>            Assignee: David Chen
>
> After SAMZA-123, the checkpoint log has a new format (keyed entries 
> interspersed with statelog-partition mapping) and a new name.  It would be 
> simple to write a tool that would consume an old-style log and write out a 
> new-style log, using the GroupByPartition strategy.  This would allow 
> existing jobs to not lose checkpointing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)