Sihua Zhou created FLINK-7873:
---------------------------------
Summary: Introduce HybridStreamStateHandle for quick recovery from
checkpoint.
Key: FLINK-7873
URL: https://issues.apache.org/jira/browse/FLINK-7873
Project: Flink
Issue Type: New Feature
Components: State Backends, Checkpointing
Affects Versions: 1.3.2
Reporter: Sihua Zhou
Assignee: Sihua Zhou
The current recovery strategy always reads checkpoint data from the remote
file system (HDFS). When the state is large (e.g. 1 TB), this consumes a lot
of network bandwidth, which can be saved by reading the checkpoint data from
local disk instead. I therefore propose a HybridStreamStateHandle that first
tries to open a local input stream and, if that fails, opens a remote input
stream. A prototype looks like this:
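{code:java}
import java.io.IOException;
import org.apache.flink.core.fs.FSDataInputStream;
import org.apache.flink.runtime.state.filesystem.FileStateHandle;

class HybridStreamStateHandle {
    private FileStateHandle localHandle;
    private FileStateHandle remoteHandle;
    ......
    public FSDataInputStream openInputStream() throws IOException {
        FSDataInputStream inputStream = null;
        try {
            // Prefer the local copy to avoid reading over the network.
            inputStream = localHandle.openInputStream();
        } catch (IOException e) {
            // Local copy missing or unreadable; fall back to the remote copy.
        }
        return inputStream != null ? inputStream : remoteHandle.openInputStream();
    }
    .....
}
{code}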
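
To make the local-first fallback easy to try outside of Flink, here is a minimal, self-contained sketch of the same idea using only java.nio. The names StreamHandle, PathHandle and HybridHandle are hypothetical stand-ins invented for this illustration, not Flink classes, and plain local paths play the role of both the "local" and the "remote" copy. It assumes that a missing local copy surfaces as an IOException when the stream is opened, which is how Files.newInputStream behaves.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical stand-in for a state handle; not a Flink type. */
interface StreamHandle {
    InputStream openInputStream() throws IOException;
}

/** Opens a stream on a file path; throws an IOException if the file is missing. */
final class PathHandle implements StreamHandle {
    private final Path path;

    PathHandle(Path path) {
        this.path = path;
    }

    @Override
    public InputStream openInputStream() throws IOException {
        return Files.newInputStream(path);
    }
}

/** Tries the local handle first and falls back to the remote handle. */
final class HybridHandle implements StreamHandle {
    private final StreamHandle local;
    private final StreamHandle remote;

    HybridHandle(StreamHandle local, StreamHandle remote) {
        this.local = local;
        this.remote = remote;
    }

    @Override
    public InputStream openInputStream() throws IOException {
        try {
            InputStream in = local.openInputStream();
            if (in != null) {
                return in;
            }
        } catch (IOException e) {
            // Local copy missing or unreadable; fall through to the remote copy.
        }
        // An IOException thrown here is propagated: both copies are unavailable.
        return remote.openInputStream();
    }
}
```

With this shape, a caller only sees an IOException when both copies are unavailable. One design question the sketch leaves open is whether a local failure should be logged or reported, since silently falling back can hide a misconfigured local directory.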