[ 
https://issues.apache.org/jira/browse/FLINK-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Qin updated FLINK-4266:
----------------------------
    Summary: Remote Storage Statebackend  (was: Cassandra StateBackend)

> Remote Storage Statebackend
> ---------------------------
>
>                 Key: FLINK-4266
>                 URL: https://issues.apache.org/jira/browse/FLINK-4266
>             Project: Flink
>          Issue Type: New Feature
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.0.3, 1.2.0
>            Reporter: Chen Qin
>            Priority: Minor
>
> The current FileSystem state backend limits total state size to the
> available disk space. Long-running tasks that hold window contents for long
> periods need to spill state to durable remote storage that is replicated
> across data centers.
> We are looking into an implementation that uses the checkpoint timestamp as
> a version and issues a range query to get the current state. We also want to
> keep "hot" state from hitting the remote DB on every update between adjacent
> checkpoints by tracking update logs, merging them, and doing batch updates
> only at checkpoint time. Lastly, we are looking for an eviction policy that
> can identify "hot keys" in k/v state and lazily load "cold keys" from
> Cassandra.
> For now, we don't have a good story for data retirement. We might have to
> keep data forever until someone manually runs a command to clean up the
> state data of a given job. Some of these features may relate to the
> incremental checkpointing feature; we hope to align with the effort there
> as well.
> Comments are welcome. I will try to put together a draft design doc after
> gathering some feedback.
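
The three ideas in the description (checkpoint timestamps as versions read
back with a range query, merging the update log so that only one batched write
per key happens at checkpoint time, and a hot-key cache that lazily loads cold
keys) can be illustrated with a minimal in-memory sketch. This is not the
proposed design and not a Flink or Cassandra API: the class RemoteStateSketch,
its methods, and the use of a TreeMap as a stand-in for the remote store are
all assumptions made for illustration, and the LRU policy is just one possible
eviction policy.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical sketch; none of these names are Flink or Cassandra APIs. */
public class RemoteStateSketch {
    // Stand-in for the remote store: key -> (checkpoint timestamp -> value).
    private final Map<String, NavigableMap<Long, String>> remote = new HashMap<>();
    // Update log since the last checkpoint; a later write to a key replaces the earlier one.
    private final Map<String, String> pending = new HashMap<>();
    // LRU cache of "hot" keys (assumed policy: evict the least recently accessed key).
    private final Map<String, String> hot;
    private int remoteWrites = 0;
    private int remoteReads = 0;

    public RemoteStateSketch(final int hotCapacity) {
        // accessOrder = true makes iteration order least-recently-accessed first.
        this.hot = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > hotCapacity;
            }
        };
    }

    /** Records an update locally; nothing is sent to the remote store yet. */
    public void update(String key, String value) {
        pending.put(key, value);
        hot.put(key, value);
    }

    /** Flushes the merged update log as one write per key, versioned by the timestamp. */
    public void checkpoint(long timestamp) {
        for (Map.Entry<String, String> e : pending.entrySet()) {
            remote.computeIfAbsent(e.getKey(), k -> new TreeMap<>())
                  .put(timestamp, e.getValue());
            remoteWrites++;
        }
        pending.clear();
    }

    /** Returns the current value: hot cache first, then unflushed updates, then remote. */
    public String get(String key) {
        String value = hot.get(key);
        if (value == null) {
            value = pending.get(key);
        }
        if (value == null) {
            // Cold key: lazily load the newest version from the remote store.
            value = readAsOf(key, Long.MAX_VALUE);
        }
        if (value != null) {
            hot.put(key, value);
        }
        return value;
    }

    /** Range-style read: newest version written at or before the given checkpoint. */
    public String readAsOf(String key, long checkpointTimestamp) {
        remoteReads++;
        NavigableMap<Long, String> versions = remote.get(key);
        if (versions == null) {
            return null;
        }
        Map.Entry<Long, String> entry = versions.floorEntry(checkpointTimestamp);
        return entry == null ? null : entry.getValue();
    }

    public int remoteWrites() { return remoteWrites; }

    public int remoteReads() { return remoteReads; }
}
```

In this sketch, two updates to the same key between checkpoints cost a single
remote write, and readAsOf(key, t) returns the state as of checkpoint t. Old
versions are never deleted, which mirrors the open data retirement question
raised above.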



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
