[
https://issues.apache.org/jira/browse/SAMZA-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shekhar Sharma updated SAMZA-2657:
----------------------------------
Attachment: (was: SAMZA-Backup-Restore-Design-Doc.pdf)
> Blob store backed state backup and restore
> ------------------------------------------
>
> Key: SAMZA-2657
> URL: https://issues.apache.org/jira/browse/SAMZA-2657
> Project: Samza
> Issue Type: New Feature
> Reporter: Shekhar Sharma
> Assignee: Shekhar Sharma
> Priority: Major
> Time Spent: 3.5h
> Remaining Estimate: 0h
>
> *Problem:*
> At LinkedIn we noticed that jobs with large state take a long time to restore
> (on the order of hours) from a Kafka-based changelog.
> *Solution:*
> We propose a blob store based backup and restore mechanism for stateful jobs.
> The advantage of such a system is the ability to back up and restore state in
> parallel, rather than one message at a time as with a Kafka-based
> changelog. We implement a pluggable system that allows any blob store
> that supports PUT/GET/DELETE APIs to be plugged in as the backend for
> Samza state backup and restore.
> *Note:*
> At this time, a general blob store interface is provided so that users and the
> community can implement the details specific to each blob store.
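
For illustration only, the pluggable PUT/GET/DELETE idea described above could be sketched roughly as follows. This is not the interface from the ticket or from Samza: the names BlobStore, InMemoryBlobStore, backupAll and restoreAll are hypothetical, and the in-memory map merely stands in for a real blob store backend. The point is that each file is uploaded or downloaded as an independent asynchronous task, so transfers can proceed in parallel instead of one message at a time.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class BlobStoreSketch {

  /** Hypothetical pluggable interface (illustrative names, not Samza's actual API). */
  interface BlobStore {
    CompletableFuture<Void> put(String key, byte[] data);
    CompletableFuture<byte[]> get(String key);
    CompletableFuture<Void> delete(String key);
  }

  /** Toy backend: a concurrent map standing in for a remote blob store. */
  static class InMemoryBlobStore implements BlobStore {
    private final Map<String, byte[]> blobs = new ConcurrentHashMap<>();
    private final ExecutorService executor;

    InMemoryBlobStore(ExecutorService executor) {
      this.executor = executor;
    }

    @Override
    public CompletableFuture<Void> put(String key, byte[] data) {
      return CompletableFuture.runAsync(() -> blobs.put(key, data.clone()), executor);
    }

    @Override
    public CompletableFuture<byte[]> get(String key) {
      return CompletableFuture.supplyAsync(() -> {
        byte[] data = blobs.get(key);
        if (data == null) {
          throw new IllegalArgumentException("No blob for key: " + key);
        }
        return data.clone();
      }, executor);
    }

    @Override
    public CompletableFuture<Void> delete(String key) {
      return CompletableFuture.runAsync(() -> blobs.remove(key), executor);
    }
  }

  /** Starts one upload per file and completes when all uploads have finished. */
  static CompletableFuture<Void> backupAll(BlobStore store, Map<String, byte[]> files) {
    List<CompletableFuture<Void>> uploads = new ArrayList<>();
    files.forEach((name, bytes) -> uploads.add(store.put(name, bytes)));
    return CompletableFuture.allOf(uploads.toArray(new CompletableFuture[0]));
  }

  /** Starts all downloads first, then waits for each result in key order. */
  static Map<String, byte[]> restoreAll(BlobStore store, List<String> keys) {
    Map<String, CompletableFuture<byte[]>> downloads = new LinkedHashMap<>();
    for (String key : keys) {
      downloads.put(key, store.get(key));
    }
    Map<String, byte[]> restored = new LinkedHashMap<>();
    downloads.forEach((key, future) -> restored.put(key, future.join()));
    return restored;
  }

  public static void main(String[] args) {
    ExecutorService pool = Executors.newFixedThreadPool(4);
    try {
      BlobStore store = new InMemoryBlobStore(pool);
      Map<String, byte[]> files = new LinkedHashMap<>();
      files.put("store/000001.sst", "first".getBytes(StandardCharsets.UTF_8));
      files.put("store/000002.sst", "second".getBytes(StandardCharsets.UTF_8));

      backupAll(store, files).join();
      Map<String, byte[]> restored = restoreAll(store, new ArrayList<>(files.keySet()));
      restored.forEach((key, bytes) ->
          System.out.println(key + " restored, " + bytes.length + " bytes"));
    } finally {
      pool.shutdown();
    }
  }
}
```

A real backend would replace the map operations with calls to the blob store's client, while callers such as backupAll would stay unchanged. That separation is the design choice the proposal describes.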
--
This message was sent by Atlassian Jira
(v8.3.4#803005)