[
https://issues.apache.org/jira/browse/SAMZA-557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yan Fang resolved SAMZA-557.
----------------------------
Resolution: Fixed
> Reuse local state in SamzaContainer on clean shutdown
> -----------------------------------------------------
>
> Key: SAMZA-557
> URL: https://issues.apache.org/jira/browse/SAMZA-557
> Project: Samza
> Issue Type: Sub-task
> Components: container
> Affects Versions: 0.9.0
> Reporter: Chris Riccomini
> Assignee: Navina Ramesh
> Attachments: SAMZA-557-0.patch, SAMZA-557-1.patch, SAMZA-557-2.patch
>
>
> Restoring state every time a SamzaContainer is restarted (due to failure, or
> re-deploy) can be expensive. Samza currently always restores state when a
> SamzaContainer starts. This could be avoided if the container is started on
> the same machine that it was running on before shutdown. It could re-use
> state that exists locally when it's restarted. There are two modes of state
> re-use:
> # A clean shutdown of the container has occurred.
> # An unclean shutdown of the container occurred.
> Re-using clean state (1) could be achieved by having the SamzaContainer write
> an OFFSET file for every local state directory when the SamzaContainer is
> shutdown (after the state stores have been stopped). A clean directory would
> look as follows:
> $PWD/state/my-kv-store/Partition-0/OFFSET
> The offset file must contain the offset of the last method in the changelog
> feed. This information is retrievable via the
> SystemAdmin.getSystemStreamMetadata method. When a SamzaContainer starts up,
> it can check if the OFFSET file exists for each store. If the OFFSET file
> does exist, the SamzaContainer can:
> # Instruct the state store to open the on-disk DB, rather than creating a new
> state store from scratch.
> # Read the OFFSET file.
> # Delete the OFFSET file.
> # Restore the state store from the OFFSET value, rather than from the oldest
> offset in the changelog (what Samza currently does).
> The OFFSET file must be removed (3) before any restoration is executed on the
> store. If this is not done, then a partial restoration might occur, followed
> by a failure. In such a case, non-idempotent writes to a store could result
> in inaccurate data being persisted to disk. The trade-off with deleting the
> OFFSET optimistically is that a failure during restoration will result in the
> whole state having to be restored, since the OFFSET file is gone. This is
> tolerable, since a failure during restoration is equivalent to an unclean
> shutdown, in which case you wouldn't expect the (possibly corrupted) local
> state to be used anyway.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)