[ https://issues.apache.org/jira/browse/SAMZA-354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14115564#comment-14115564 ]
Jakob Homan commented on SAMZA-354:
-----------------------------------
bq. One thing that might be nice with this tool is to leave it consuming from
the 0.7 checkpoint topic even after it's read up to the "latest" offset, rather
than having it shut itself down once it's caught up.
This seems like a lot of work for marginal benefit. Since Samza jobs store
their input in Kafka, they can handle a few minutes of downtime, and running the
conversion tool should not take any significant time (read a de-duped Kafka
topic, write a new Kafka topic, done). I'm wary of extra complexity for
one-time-per-job operations.
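
To make the "read a de-duped topic, write a new topic" flow concrete, here is a rough in-memory sketch. It does not touch Kafka, and the record shapes are assumptions for illustration only: the real pre- and post-SAMZA-123 checkpoint formats are not spelled out in this thread. The names `latest_per_partition`, `convert`, and the "Partition N" task naming are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical record shapes (assumptions, not the actual Samza formats):
#   old-style: (partition id, {stream name: offset})
#   new-style: (task name key, {"stream:partition": offset})
OldRecord = Tuple[int, Dict[str, str]]
NewRecord = Tuple[str, Dict[str, str]]


def latest_per_partition(old_log: Iterable[OldRecord]) -> Dict[int, Dict[str, str]]:
    """Keep only the last checkpoint seen for each partition.

    This mimics reading a de-duped topic: later records replace earlier ones.
    """
    latest: Dict[int, Dict[str, str]] = {}
    for partition, offsets in old_log:
        latest[partition] = offsets
    return latest


def convert(old_log: Iterable[OldRecord]) -> List[NewRecord]:
    """Emit one keyed entry per input partition (a group-by-partition layout)."""
    new_log: List[NewRecord] = []
    for partition, offsets in sorted(latest_per_partition(old_log).items()):
        task_name = f"Partition {partition}"  # assumed task naming
        keyed = {f"{stream}:{partition}": offset for stream, offset in offsets.items()}
        new_log.append((task_name, keyed))
    return new_log
```

For example, given the old-style records `[(0, {"clicks": "10"}), (1, {"clicks": "7"}), (0, {"clicks": "42"})]`, `convert` keeps the later checkpoint for partition 0 and produces one keyed entry per partition. A real tool would replace the in-memory lists with a consumer on the old topic and a producer on the new one, then exit once it reaches the end of the old topic.
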
> Write tool to convert old-style checkpoint log to post-SAMZA-123 format
> -----------------------------------------------------------------------
>
> Key: SAMZA-354
> URL: https://issues.apache.org/jira/browse/SAMZA-354
> Project: Samza
> Issue Type: Task
> Affects Versions: 0.8.0
> Reporter: Jakob Homan
> Assignee: David Chen
>
> After SAMZA-123, the checkpoint log has a new format (keyed entries
> interspersed with statelog-partition mapping) and a new name. It would be
> simple to write a tool that would consume an old-style log and write out a
> new-style log, using the GroupByPartition strategy. This would allow
> existing jobs to not lose checkpointing.