[jira] [Commented] (FLINK-5715) Asynchronous snapshotting for HeapKeyedStateBackend

ASF GitHub Bot (JIRA) Thu, 09 Mar 2017 02:11:59 -0800

    [ https://issues.apache.org/jira/browse/FLINK-5715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15902829#comment-15902829 ]


ASF GitHub Bot commented on FLINK-5715:
---------------------------------------

Github user StephanEwen commented on the issue:

    https://github.com/apache/flink/pull/3466
  
    I think this is all in all very good code!
    
    One thing I am worried about is the testing time. The
    `EventTimeWindowCheckpointingITCase` tests already take very long, and
    now we have two more.
    
    What we should probably do is the following:
      - The data volume in that test is very high, and I think that was
        mainly done to stress RocksDB's async snapshots a bit.
      - The heavy load can be moved to a RocksDB-specific async snapshot
        test (that does not need to use windows).
      - The base of the `EventTimeWindowCheckpointingITCase` tests can then
        be made much more lightweight.



> Asynchronous snapshotting for HeapKeyedStateBackend
> ---------------------------------------------------
>
>                 Key: FLINK-5715
>                 URL: https://issues.apache.org/jira/browse/FLINK-5715
>             Project: Flink
>          Issue Type: New Feature
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.3.0
>            Reporter: Stefan Richter
>            Assignee: Stefan Richter
>
> Blocking snapshots render the HeapKeyedStateBackend practically unusable for 
> many users in production. Their jobs cannot tolerate stopped processing for 
> the time it takes to write gigabytes of data from memory to disk. 
> Asynchronous snapshots would be a solution to this problem. The challenge for 
> the implementation is coming up with a copy-on-write (CoW) scheme for the 
> in-memory hash maps that form the foundation of this backend. After taking a 
> closer look, this problem is twofold. The first part is providing CoW 
> semantics for the hash map itself, as a mutable structure, while avoiding 
> costly locking or blocking where possible. The second part is CoW for the 
> mutable value objects, e.g. through cloning via serializers.
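
The two parts of the problem described above can be illustrated with a minimal
sketch. This is not the code from the pull request: the class `CowStateMap`,
the wrapper `Versioned`, and the `copier` function (a stand-in for cloning via
a serializer) are hypothetical names chosen for illustration. For simplicity
the sketch takes a shallow copy of the references when a snapshot starts, which
a real implementation would likely try to make cheaper.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Hypothetical sketch of copy-on-write state; not Flink's implementation. */
final class CowStateMap<K, V> {

    /** A value tagged with the map version in which it was stored or copied. */
    private static final class Versioned<V> {
        final V value;
        final int version;

        Versioned(V value, int version) {
            this.value = value;
            this.version = version;
        }
    }

    private final Map<K, Versioned<V>> entries = new HashMap<>();
    private final UnaryOperator<V> copier; // assumption: deep-copies a value
    private int version = 0;               // incremented by every snapshot

    CowStateMap(UnaryOperator<V> copier) {
        this.copier = copier;
    }

    void put(K key, V value) {
        entries.put(key, new Versioned<>(value, version));
    }

    /** Read-only access; the caller must not mutate the returned object. */
    V get(K key) {
        Versioned<V> e = entries.get(key);
        return e == null ? null : e.value;
    }

    /**
     * Returns a value the caller may mutate in place. If the value predates
     * the latest snapshot, it is copied first so the snapshot stays stable.
     * Callers must not keep the returned reference across a snapshot() call.
     */
    V getForUpdate(K key) {
        Versioned<V> e = entries.get(key);
        if (e == null) {
            return null;
        }
        if (e.version < version) {
            e = new Versioned<>(copier.apply(e.value), version);
            entries.put(key, e);
        }
        return e.value;
    }

    /**
     * Synchronous part of a snapshot: copies references only, then bumps the
     * version. The returned map can be written out by another thread.
     */
    Map<K, V> snapshot() {
        Map<K, V> view = new HashMap<>();
        for (Map.Entry<K, Versioned<V>> e : entries.entrySet()) {
            view.put(e.getKey(), e.getValue().value);
        }
        version++;
        return view;
    }
}
```

In this sketch the map structure is protected by the reference copy, and the
mutable values are protected by copying on first write after a snapshot, so
processing only pauses for the reference copy rather than for the disk write.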



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
