[ 
https://issues.apache.org/jira/browse/SAMZA-1480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16236034#comment-16236034
 ] 

ASF GitHub Bot commented on SAMZA-1480:
---------------------------------------

GitHub user jmakes opened a pull request:

    https://github.com/apache/samza/pull/350

    SAMZA-1480: TaskStorageManager improperly initializes changelog consumer
    position when restoring a store from disk

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jmakes/samza samza-1480

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/samza/pull/350.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #350
    
----
commit bafe785eea7c8deb7b1766357fdada6449bde798
Author: Jacob Maes <[email protected]>
Date:   2017-11-02T15:58:33Z

    SAMZA-1480: TaskStorageManager improperly initializes changelog consumer
    position when restoring a store from disk

----


> TaskStorageManager improperly initializes changelog consumer position when 
> restoring a store from disk
> ------------------------------------------------------------------------------------------------------
>
>                 Key: SAMZA-1480
>                 URL: https://issues.apache.org/jira/browse/SAMZA-1480
>             Project: Samza
>          Issue Type: Bug
>    Affects Versions: 0.10.0
>            Reporter: Jake Maes
>            Assignee: Jake Maes
>             Fix For: 0.14.0
>
>
> For the Host Affinity state restore, an OFFSET file is written to disk on
> each commit. This file contains the offset of the most recently written
> changelog event, which is also reflected in the on-disk state. When the
> container is restarted, it restores the on-disk store and then replays the
> changelog from the offset recorded in the OFFSET file in order to restore any
> changelog events that were produced while the job ran on a different host.
> See http://samza.apache.org/learn/documentation/0.13/yarn/yarn-host-affinity.html
>
> When TaskStorageManager initializes the consumer, it uses the offset from the
> OFFSET file, even though the event at that offset is already reflected in the
> state. Instead, it should use the SystemAdmin.getOffsetsAfter() method to get
> the next offset to consume. This avoids replaying 1 extra message during
> state restore.
>
> It should then use SystemAdmin.offsetComparator() to pick the larger of the
> next offset (calculated above) and the oldest offset (according to the
> metadata). This is necessary for changelogs configured with TTL retention
> rather than infinite retention, where the offset from the OFFSET file may no
> longer be valid.
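
The selection rule described above can be sketched roughly as follows. This is
only an illustration, not the code from the pull request: the class and method
names are hypothetical, the two functional parameters are simplified stand-ins
for SystemAdmin.getOffsetsAfter() and SystemAdmin.offsetComparator(), and the
offsets are assumed to be numeric strings purely for the demo.

```java
import java.util.Comparator;
import java.util.function.UnaryOperator;

// Hypothetical sketch; none of these names come from the Samza code base.
class RestoreOffsetSketch {

    /**
     * Picks the changelog offset to start restoring from.
     *
     * @param fileOffset       offset read from the OFFSET file (already reflected in the store)
     * @param oldestOffset     oldest offset still available in the changelog
     * @param offsetAfter      stand-in for SystemAdmin.getOffsetsAfter(), reduced to one offset
     * @param offsetComparator stand-in for SystemAdmin.offsetComparator()
     */
    static String startingOffset(String fileOffset,
                                 String oldestOffset,
                                 UnaryOperator<String> offsetAfter,
                                 Comparator<String> offsetComparator) {
        // Skip the event that is already in the on-disk state.
        String nextOffset = offsetAfter.apply(fileOffset);
        // If retention has already dropped that offset, fall back to the oldest one.
        return offsetComparator.compare(nextOffset, oldestOffset) >= 0
                ? nextOffset
                : oldestOffset;
    }

    public static void main(String[] args) {
        // Demo assumption: offsets are decimal numbers encoded as strings.
        UnaryOperator<String> next = o -> String.valueOf(Long.parseLong(o) + 1);
        Comparator<String> cmp = Comparator.comparingLong(Long::parseLong);

        // OFFSET file is still within retention: resume one past it.
        System.out.println(startingOffset("41", "10", next, cmp));  // prints 42
        // OFFSET file points at expired data: start from the oldest offset.
        System.out.println(startingOffset("41", "100", next, cmp)); // prints 100
    }
}
```

The first call covers the common case (one message fewer is replayed), and the
second covers a TTL-retained changelog whose recorded offset has expired.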



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
